Bug 1313011 - User2 unable to run retrace-server-interact due to group permissions problem created by user1 on crash_cmd file
Status: CLOSED CURRENTRELEASE
Product: Fedora EPEL
Classification: Fedora
Component: retrace-server
Version: el6
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Assigned To: Dave Wysochanski
QA Contact: Fedora Extras Quality Assurance
Keywords: Regression, Reopened
Duplicates: 1195786
Reported: 2016-02-29 12:25 EST by Dave Wysochanski
Modified: 2017-04-11 16:26 EDT
Fixed In Version: retrace-server-1.15-2.fc23 retrace-server-1.15-2.el6 retrace-server-1.15-2.el7
Doc Type: Bug Fix
Last Closed: 2017-04-11 16:26:58 EDT
Type: Bug


Attachments
WIP patch for resolving crash_cmd issue - setuid / setgid to retrace:CONFIG["AuthGroup"] (2.62 KB, patch)
2016-03-04 06:11 EST, Dave Wysochanski
patch to fix this bug (1.97 KB, application/octet-stream)
2016-03-04 19:24 EST, Dave Wysochanski
patch to fix this bug (2.64 KB, application/octet-stream)
2016-03-04 19:24 EST, Dave Wysochanski
updated patch to fix this bug - make chown/chgrp best effort (2.29 KB, patch)
2016-03-05 05:47 EST, Dave Wysochanski
updated patch to fix this bug - make chown/chgrp best effort (2.89 KB, patch)
2016-03-05 05:47 EST, Dave Wysochanski

Description Dave Wysochanski 2016-02-29 12:25:35 EST
Description of problem:
Unfortunately, I didn't notice this one until it was deployed into production.

For an existing task, if user1 runs "retrace-server-interact 980813160 crash", it creates the 'crash_cmd' file just fine.  However, a second user running the same command gets this:

$ retrace-server-interact 980813160 crash
Traceback (most recent call last):
  File "/usr/bin/retrace-server-interact", line 119, in <module>
    task.set_crash_cmd(' '.join(crash_cmd))
  File "/usr/lib/python2.6/site-packages/retrace/retrace.py", line 2173, in set_crash_cmd
    self.set(RetraceTask.CRASH_CMD_FILE, data)
  File "/usr/lib/python2.6/site-packages/retrace/retrace.py", line 1607, in set
    with open(self._get_file_path(key), mode) as f:
IOError: [Errno 13] Permission denied: '/cores/retrace/tasks/980813160/crash_cmd'
$ ls -lh /cores/retrace/tasks/980813160/crash_cmd
-rw-rw-r--. 1 user1 user1 5 Feb 29 09:29 /cores/retrace/tasks/980813160/crash_cmd

This happens because the task directory does not have the setgid bit set, and for some reason every time we run 'crash' we try to rewrite the crash_cmd file.  I think this is because kernel version detection is called from that code path.
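
For illustration, the conditions described above can be checked with a small diagnostic. This is only a sketch in Python 3 and not part of retrace-server; the helper name `describe_group_access` and the scratch directory are hypothetical. In the listing above the file is group-writable, but its group is user1's own group, so the group write bit does not help user2.

```python
import os
import stat
import tempfile


def describe_group_access(path):
    """Report the ownership and mode bits that decide whether another
    member of a shared group could rewrite `path`."""
    st = os.stat(path)
    parent = os.stat(os.path.dirname(os.path.abspath(path)))
    return {
        "uid": st.st_uid,
        "gid": st.st_gid,
        "group_writable": bool(st.st_mode & stat.S_IWGRP),
        "same_group_as_parent": st.st_gid == parent.st_gid,
        "parent_setgid": bool(parent.st_mode & stat.S_ISGID),
    }


# Demo on a scratch directory standing in for a task directory (assumption).
demo_dir = tempfile.mkdtemp()
demo_file = os.path.join(demo_dir, "crash_cmd")
with open(demo_file, "w") as f:
    f.write("crash")
os.chmod(demo_file, 0o664)  # same mode as in the listing above
print(describe_group_access(demo_file))
```

On a real task directory, a file whose gid is not the shared group, inside a directory without the setgid bit, matches the failure reported here.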


Version-Release number of selected component (if applicable):
retrace-server-1.14-2.el6.noarch


How reproducible:
Every time on an existing vmcore.


Steps to Reproduce:
1. User1 runs retrace-server-interact <taskid> crash. This creates the 'crash_cmd' file owned by User1's user and group.
2. User2 runs retrace-server-interact <taskid> crash. This fails with a traceback because User2 cannot write the 'crash_cmd' file.


Actual results:
user2 unable to open the vmcore

Expected results:
user2 able to open the vmcore


Additional info:
This is due to a couple of things:
1. There is no setgid bit set on each 'taskid' directory.
2. prepare_debuginfo may change the 'crash_cmd', so we need to write it after running prepare_debuginfo.
3. Every time any user calls 'retrace-server-interact <taskid> crash' we call prepare_debuginfo.

Here is the affected code.
/usr/bin/retrace-server-interact:
110 
111         hostarch = os.uname()[4]
112         if hostarch in ["i486", "i586", "i686"]:
113             hostarch = "i386"
114 
115         if args.action == "crash":
116             if not task.use_mock(kernelver):
117                 crash_cmd = task.get_crash_cmd().split()
118                 vmlinux = prepare_debuginfo(vmcore, kernelver=kernelver, crash_cmd=crash_cmd)
119-->                task.set_crash_cmd(' '.join(crash_cmd))
120                 if task.has_crashrc():
121                     cmdline = task.get_crash_cmd().split() + ["-i", task.get_crashrc_path(), vmcore, vmlinux]
122                 else:
123                     cmdline = task.get_crash_cmd().split() + [vmcore, vmlinux]
124             else:
125                 cfgdir = os.path.join(CONFIG["SaveDir"], "%d-kernel" % task.get_taskid())
126                 vmlinux = prepare_debuginfo(vmcore, chroot=cfgdir, kernelver=kernelver, crash_cmd=task.get_crash_cmd().split())
127                 if task.has_crashrc():
128                     cmdline = ["/usr/bin/mock", "--configdir", cfgdir,


I will need to look at possible solutions.  One idea that was brought up is making the task directory setgid with group AuthGroup.  However, I'll have to look closer at this, since it's not obvious to me why we'd want to write that file again if nothing has changed, or even why we call prepare_debuginfo unconditionally (maybe we need the vmlinux path?).  It seems like detection may be broken here, since we should call prepare_debuginfo only once, at kernel version detection time.  Once we have the kernelver, I don't think we should call it again.
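
The idea of not rewriting the file when nothing has changed could look roughly like the sketch below. It is Python 3 and is not the retrace-server code or the eventual fix; `write_if_changed` is a hypothetical helper. It only avoids the failure when the stored command is readable and identical, so a real change to crash_cmd would still hit the permission error.

```python
import errno
import os
import tempfile


def write_if_changed(path, data):
    """Write `data` to `path` only when the stored content differs.

    Returns True if the file was (re)written, False if it was left alone.
    """
    try:
        with open(path, "r") as f:
            if f.read() == data:
                return False  # nothing to do, so no write access is needed
    except IOError as e:
        # A missing file is expected on old tasks; anything else is a real error.
        if e.errno != errno.ENOENT:
            raise
    with open(path, "w") as f:
        f.write(data)
    return True


# Demo: the second call with identical data does not open the file for writing.
demo_path = os.path.join(tempfile.mkdtemp(), "crash_cmd")
print(write_if_changed(demo_path, "crash"))
print(write_if_changed(demo_path, "crash"))
```

With the permissions shown in the listing (rw-rw-r--), user2 can still read the file, so the comparison itself would not need group membership.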
Comment 1 Dave Wysochanski 2016-02-29 20:26:58 EST
The more I think about it, the more I think that having any file inside a 'tasks' directory with user != retrace or group != AuthGroup is probably wrong.  Maybe the only way to enforce that is the setgid / setuid bits on the directory.
Comment 4 Dave Wysochanski 2016-03-04 05:33:36 EST
This is actually a much bigger problem than I thought.  In addition to the problem with the crash_cmd file, we have a regression in the retrace-server-worker --restart command.

We now get the following backtrace when we try retrace-server-worker --restart.  Something changed in the management of the retrace_log file in the latest retrace-server-1.14-2.el6.noarch.

$ retrace-server-worker --restart 933538188
Traceback (most recent call last):
  File "/usr/bin/retrace-server-worker", line 28, in <module>
    worker.begin_logging()
  File "/usr/lib/python2.6/site-packages/retrace/retrace_worker.py", line 17, in begin_logging
    self.task._get_file_path(RetraceTask.LOG_FILE))
  File "/usr/lib64/python2.6/logging/__init__.py", line 827, in __init__
    StreamHandler.__init__(self, self._open())
  File "/usr/lib64/python2.6/logging/__init__.py", line 846, in _open
    stream = open(self.baseFilename, self.mode)
IOError: [Errno 13] Permission denied: '/retrace/tasks/933538188/retrace_log'
Comment 5 Dave Wysochanski 2016-03-04 06:11:27 EST
Created attachment 1133142 [details]
WIP patch for resolving crash_cmd issue - setuid / setgid to retrace:CONFIG["AuthGroup"]
Comment 6 Dave Wysochanski 2016-03-04 06:13:54 EST
Comment on attachment 1133142 [details]
WIP patch for resolving crash_cmd issue - setuid / setgid to retrace:CONFIG["AuthGroup"]

Experimental patch to resolve this issue
Comment 7 Dave Wysochanski 2016-03-04 09:39:07 EST
The previously posted patch obviously doesn't work, but if we could do that it would be ideal.

Confirmed that setgid on the directory solves the original problem with crash_cmd on existing tasks.  I'm not sure that's the best solution, though.

We still have the worker restart backtrace, though, so we may need to do significant refactoring and add a helper to write files in the retrace directory.  Not sure.
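
As a rough illustration of the setgid approach mentioned in this comment (a Python 3 sketch, not the attached patch; `enable_setgid` is a hypothetical helper): on Linux, files created inside a directory with the setgid bit inherit the directory's group rather than the creating user's primary group. The directory itself must already belong to the shared group, and each user's umask must still allow group write.

```python
import os
import stat
import tempfile


def enable_setgid(dirpath):
    """Add the setgid bit and group rwx to a directory, keeping other bits.

    Returns the resulting permission bits so the caller can verify that the
    setgid bit was actually applied.
    """
    mode = stat.S_IMODE(os.stat(dirpath).st_mode)
    os.chmod(dirpath, mode | stat.S_ISGID | stat.S_IRWXG)
    return stat.S_IMODE(os.stat(dirpath).st_mode)


# Demo on a scratch directory standing in for a task directory (assumption).
task_dir = tempfile.mkdtemp()
print(oct(enable_setgid(task_dir)))
```

The helper returns the mode instead of assuming success because chmod can silently drop the setgid bit when an unprivileged caller is not a member of the directory's group.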
Comment 8 Dave Wysochanski 2016-03-04 14:26:26 EST
Actually, the retrace-server-worker --restart bug looks like it will require a very different, unrelated fix, so I've renamed this bug back to its original purpose and opened a separate bug for --restart:
https://bugzilla.redhat.com/show_bug.cgi?id=1314897
Comment 9 Dave Wysochanski 2016-03-04 17:16:53 EST
Note that this bug _only_ affects existing tasks that were created prior to retrace-server-1.14-2, because the crash_cmd file was not created during the retrace of the vmcore.  It is thus really only a problem during an upgrade, and when someone re-creates the crash_cmd file with improper group ownership.

That said, there are a lot of problems with ownership of files in 'SaveDir' if a user runs interactive mode commands.  For example, I just found it's even possible to take down the 'manager' interface if a user issues --restart on a task and some file required for the 'manager' page is not readable by 'other'.  To prevent this, we probably need setgid on the SaveDir or some serious refactoring.  This is not a new issue, though; it has only become critical because of the 'crash_cmd' file, which doesn't exist on old tasks.

We could defer this bug and, just prior to the upgrade / outage, do the following:
1. Manually create a default "crash_cmd" file for all tasks, with proper retrace:CONFIG["AuthGroup"] perms
2. Upgrade to retrace-server-1.14-2
3. Check and make sure existing tasks can be loaded by multiple users
4. Check to make sure new tasks can be loaded by multiple users
5. Check that retrace-server-worker --restart works

Unfortunately, step #5 will create the same scenario that exists in this bug: crash_cmd will have user1's ownership and user2 won't be able to access the vmcore.  This is because --restart removes 'crash_cmd' along with other task state files, and because we rewrite the crash_cmd file every time we issue retrace-server-interact.

I may try to fix this last part.  That is, I don't think we should be rewriting crash_cmd every time we issue retrace-server-interact.  However, fixing this means dealing with where we obtain the vmlinux file, i.e. prepare_debuginfo, which is a whole other mess...
Comment 10 Dave Wysochanski 2016-03-04 19:24:49 EST
Created attachment 1133267 [details]
patch to fix this bug
Comment 11 Dave Wysochanski 2016-03-04 19:24:51 EST
Created attachment 1133268 [details]
patch to fix this bug
Comment 12 Dave Wysochanski 2016-03-04 19:26:51 EST
Posted pull request on https://github.com/abrt/retrace-server
Comment 13 Dave Wysochanski 2016-03-05 05:47:13 EST
Created attachment 1133306 [details]
updated patch to fix this bug - make chown/chgrp best effort
Comment 14 Dave Wysochanski 2016-03-05 05:47:15 EST
Created attachment 1133307 [details]
updated patch to fix this bug - make chown/chgrp best effort
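
The "best effort" idea named in the attachment title could be sketched as below. This is an illustrative Python 3 example only and does not reproduce the attached patch; `chown_best_effort` is a hypothetical helper. The ownership change is attempted, and a permission failure is reported to the caller instead of aborting the command.

```python
import errno
import os
import tempfile


def chown_best_effort(path, uid=-1, gid=-1):
    """Try to change owner and/or group of `path`; -1 leaves a field as is.

    Returns True on success and False when the caller lacks permission.
    Any other error is re-raised.
    """
    try:
        os.chown(path, uid, gid)
        return True
    except OSError as e:
        if e.errno in (errno.EPERM, errno.EACCES):
            return False  # not allowed: carry on without changing ownership
        raise


# Demo: re-applying the file's current group is always permitted for its owner.
demo_path = os.path.join(tempfile.mkdtemp(), "crash_cmd")
with open(demo_path, "w") as f:
    f.write("crash")
print(chown_best_effort(demo_path, gid=os.stat(demo_path).st_gid))
```

An unprivileged user generally cannot give a file to another owner, which is why a failed attempt is treated here as a normal outcome rather than an error.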
Comment 15 Dave Wysochanski 2016-03-19 19:27:50 EDT
*** Bug 1195786 has been marked as a duplicate of this bug. ***
Comment 16 Fedora Update System 2016-03-21 08:23:43 EDT
retrace-server-1.15-1.fc23 has been submitted as an update to Fedora 23. https://bodhi.fedoraproject.org/updates/FEDORA-2016-88382c37cc
Comment 17 Fedora Update System 2016-03-21 08:23:54 EDT
retrace-server-1.15-1.fc24 has been submitted as an update to Fedora 24. https://bodhi.fedoraproject.org/updates/FEDORA-2016-3fd6c52337
Comment 18 Fedora Update System 2016-03-21 08:24:03 EDT
retrace-server-1.15-1.el6 has been submitted as an update to Fedora EPEL 6. https://bodhi.fedoraproject.org/updates/FEDORA-EPEL-2016-1082131b97
Comment 19 Fedora Update System 2016-03-21 08:24:11 EDT
retrace-server-1.15-1.el7 has been submitted as an update to Fedora EPEL 7. https://bodhi.fedoraproject.org/updates/FEDORA-EPEL-2016-505f2e5c59
Comment 20 Fedora Update System 2016-03-21 18:31:21 EDT
retrace-server-1.15-1.fc24 has been pushed to the Fedora 24 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2016-3fd6c52337
Comment 21 Fedora Update System 2016-03-21 20:26:31 EDT
retrace-server-1.15-1.el7 has been pushed to the Fedora EPEL 7 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-EPEL-2016-505f2e5c59
Comment 22 Fedora Update System 2016-03-22 02:49:19 EDT
retrace-server-1.15-1.el6 has been pushed to the Fedora EPEL 6 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-EPEL-2016-1082131b97
Comment 23 Fedora Update System 2016-03-22 11:23:00 EDT
retrace-server-1.15-1.fc23 has been pushed to the Fedora 23 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2016-88382c37cc
Comment 24 Fedora Update System 2016-03-22 12:43:57 EDT
retrace-server-1.15-2.el6 has been submitted as an update to Fedora EPEL 6. https://bodhi.fedoraproject.org/updates/FEDORA-EPEL-2016-1082131b97
Comment 25 Fedora Update System 2016-03-23 04:57:04 EDT
retrace-server-1.15-2.el7 has been submitted as an update to Fedora EPEL 7. https://bodhi.fedoraproject.org/updates/FEDORA-EPEL-2016-505f2e5c59
Comment 26 Fedora Update System 2016-03-23 04:57:15 EDT
retrace-server-1.15-2.fc24 has been submitted as an update to Fedora 24. https://bodhi.fedoraproject.org/updates/FEDORA-2016-3fd6c52337
Comment 27 Fedora Update System 2016-03-23 04:58:44 EDT
retrace-server-1.15-2.fc23 has been submitted as an update to Fedora 23. https://bodhi.fedoraproject.org/updates/FEDORA-2016-88382c37cc
Comment 28 Fedora Update System 2016-03-23 15:57:00 EDT
retrace-server-1.15-2.fc24 has been pushed to the Fedora 24 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2016-3fd6c52337
Comment 29 Fedora Update System 2016-03-23 21:53:41 EDT
retrace-server-1.15-2.fc23 has been pushed to the Fedora 23 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2016-88382c37cc
Comment 30 Fedora Update System 2016-03-24 11:49:43 EDT
retrace-server-1.15-2.el6 has been pushed to the Fedora EPEL 6 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-EPEL-2016-1082131b97
Comment 31 Fedora Update System 2016-03-24 11:50:24 EDT
retrace-server-1.15-2.el7 has been pushed to the Fedora EPEL 7 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-EPEL-2016-505f2e5c59
Comment 32 Fedora Update System 2016-04-05 09:54:08 EDT
retrace-server-1.15-2.fc23 has been pushed to the Fedora 23 stable repository. If problems still persist, please make note of it in this bug report.
Comment 33 Fedora Update System 2016-04-05 12:21:15 EDT
retrace-server-1.15-2.fc23 has been pushed to the Fedora 23 stable repository. If problems still persist, please make note of it in this bug report.
Comment 34 Fedora Update System 2016-04-08 17:25:12 EDT
retrace-server-1.15-2.el6 has been pushed to the Fedora EPEL 6 stable repository. If problems still persist, please make note of it in this bug report.
Comment 35 Fedora Update System 2016-04-08 17:31:15 EDT
retrace-server-1.15-2.el7 has been pushed to the Fedora EPEL 7 stable repository. If problems still persist, please make note of it in this bug report.
Comment 36 Fedora Update System 2016-06-03 05:46:25 EDT
retrace-server-1.16-1.fc24 has been submitted as an update to Fedora 24. https://bodhi.fedoraproject.org/updates/FEDORA-2016-81eb6f786a
Comment 37 Fedora Update System 2016-06-04 14:26:33 EDT
retrace-server-1.16-1.fc24 has been pushed to the Fedora 24 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2016-81eb6f786a
Comment 38 Dave Wysochanski 2016-06-07 07:12:13 EDT
I verified this has been fixed in retrace-server-1.15-1.el6
Comment 39 Fedora Update System 2016-07-01 02:19:58 EDT
retrace-server-1.16-2.fc24 has been submitted as an update to Fedora 24. https://bodhi.fedoraproject.org/updates/FEDORA-2016-81eb6f786a
Comment 40 Fedora Update System 2016-07-02 16:31:30 EDT
retrace-server-1.16-2.fc24 has been pushed to the Fedora 24 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2016-81eb6f786a
