Bug 1062706 - Take procedure is worse than in previous versions of Beaker
Summary: Take procedure is worse than in previous versions of Beaker
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Beaker
Classification: Retired
Component: web UI
Version: 0.15
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: 19.0
Assignee: Dan Callaghan
QA Contact: tools-bugs
URL:
Whiteboard:
Depends On: 1014438
Blocks:
Reported: 2014-02-07 18:31 UTC by Prarit Bhargava
Modified: 2018-02-06 00:41 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2014-11-25 07:18:10 UTC
Embargoed:




Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Bugzilla | 790492 | 0 | unspecified | CLOSED | [RFE] Provide a simple mechanism for admins to check a system is working properly | 2021-02-22 00:41:40 UTC
Red Hat Bugzilla | 851354 | 0 | unspecified | CLOSED | RFE: Allow scheduling of jobs against Manual and Broken systems | 2021-02-22 00:41:40 UTC

Internal Links: 790492 851354

Description Prarit Bhargava 2014-02-07 18:31:34 UTC
Description of problem: Previously in Beaker, to "Take" a machine I could just click one button that, AFAIU, was based on the ACLs.

Now I have to do the following (this assumes that "Loan to Self" has been set on a particular system):

a) Click the loan settings button
b) Type my user name in the new pop-up window
c) Click the update loan button to close the pop-up window
d) Reload the web page
e) Click the new Take button

This is an extreme annoyance and an overly complex procedure for something that should (not _could_) be a one-button feature.

Version-Release number of selected component (if applicable): 0.1.53


How reproducible: 100%


Steps to Reproduce: See above.

Actual results: See above

Comment 2 Jiri Hladky 2014-02-07 20:11:47 UTC
Hello,

this affects our team as well. The situation is even worse when you don't have rights to loan the system to yourself. In that case the only option is "Schedule Provision". Even if the system is free, it takes tens of minutes (I have now been waiting for 18 minutes) until the job is processed by Beaker and started. And if anything goes wrong during the installation, the job gets aborted and the system goes to another user, leaving no way to debug the problem via the serial console.

In summary, without the TAKE functionality:
- it takes much longer to provision the system, because a job is now created and it takes time before Beaker processes it
- it's impossible to debug the installation in the serial console; if the job is aborted, the system goes back to the pool

Could you please enable the TAKE button in the current version of UI?

Thanks a lot
Jirka

Comment 3 xjia 2014-02-08 02:20:19 UTC
I think it's related to BZ#1062529, BZ#1062469 and BZ#1062480.

Comment 4 Jiri Hladky 2014-02-09 21:16:08 UTC
Hi

I'm sorry, but I don't see any relation to BZ#1062529, BZ#1062469 or BZ#1062480.

Those 3 BZs describe performance issues.

In contrast, this BZ deals with removed functionality: the TAKE button to take the system is no longer available in the current web UI.

The TAKE button used to be on the details page:

https://beaker.engineering.redhat.com/view/karkulka.lab.eng.brq.redhat.com#details

next to the "Current user" field and provided the functionality to become the current user of the box. Could we please get this button back?

Thanks
Jirka

Comment 5 Nick Coghlan 2014-02-10 00:32:51 UTC
This is a deliberate UI change in Beaker 0.15 to prevent destructive interference with currently running automated jobs: a system must be either in Manual mode or explicitly loaned to the logged-in user before manual reservation is permitted through the system page.

http://beaker-project.org/docs/whats-new/release-0.15.html#clarified-take-schedule-provision-and-provision-in-the-web-ui

The restriction to only allow manual reservation of non-automated systems was introduced in #855333 and adjusted to also allow manual reservation of loaned automated systems in #1015131.
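
As a rough illustration of the rule described above, the following Python sketch models when manual reservation would be permitted from the system page. The `SystemRecord` class, its fields, and `take_allowed` are hypothetical names made up for this sketch; they are not Beaker's actual code or data model. Access-policy (ACL) checks are deliberately left out, and only the Automated and Manual modes are modelled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SystemRecord:
    # Hypothetical, simplified stand-in for a Beaker system record.
    mode: str                           # "Automated" or "Manual"
    current_user: Optional[str] = None  # who holds the system right now, if anyone
    loaned_to: Optional[str] = None     # who the system is loaned to, if anyone


def take_allowed(system: SystemRecord, user: str) -> bool:
    """Return True if `user` may manually reserve `system` from the system page."""
    if system.current_user is not None:
        # Already in use (by a job or by a person), so there is nothing to take.
        return False
    if system.mode == "Manual":
        # Manual systems can be reserved by hand (permission checks omitted).
        return True
    # Automated systems can only be taken by the user they are loaned to.
    return system.loaned_to == user
```

Under this model, a free Automated system with no loan in place is not offered for manual reservation, which is the behaviour reported in this bug.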

We're aware this interim fix creates workflow issues for certain use cases, and would appreciate end user feedback on the proposed full fix currently planned for Beaker 0.17:

http://beaker-project.org/dev/proposals/system-page-improvements.html#system-page-improvements

Comment 6 Jiri Hladky 2014-02-10 10:33:27 UTC
Hi Nick,

thanks a lot for the clarification. Could you please provide a link to the documentation describing the difference between Manual and Automated mode? In particular, can you schedule Beaker jobs via

bkr job-submit <xml_file>

when a system is in Manual mode?

I would still *strongly* advocate getting the TAKE button back when a system is FREE.

The reason given for removing the TAKE button, described here
http://beaker-project.org/docs/whats-new/release-0.15.html#clarified-take-schedule-provision-and-provision-in-the-web-ui

is
"The Take button on the system page no longer appears by default for systems set to Automated, as this was a common source of confusion for new users, and could result in users accidentally interrupting a running job."

The Take button is extremely useful when dealing with installation problems. Often, it's just a silly bug in Anaconda which requires some user input that can be provided via the serial console. However, you need the Take button for that. When you schedule the provision, it takes time until the job is started and you cannot really follow the serial console all the time. Even worse, when the installation times out, the system is gone. You really need the Take button for this.

For this reason I would argue that the TAKE button should be available whenever a system is FREE (no job is running on it). This would eliminate the problem of a novice user accidentally killing a running job through inappropriate use of the Take button, and at the same time give us this extremely useful functionality back.

Please let me know your opinion on that.

Thanks a lot
Jirka

Comment 7 Nick Coghlan 2014-02-10 23:56:44 UTC
Note that in Beaker 0.15.3+, systems with monitored console logs that fail during installation (including prompting for user input) *will fail* - they won't stall waiting for the external watchdog to time out.

Even without that change (which was explicitly requested by Platform QE in bug 952661), I don't see how restoring the "Take" button for free systems in Automated mode would help - if Beaker is running an installation, then it has scheduled a job and the system isn't going to be free.

Are the systems in question here generally selected using a host requirement that names a specific system, rather than an arbitrary system that provides certain hardware? If so, then it may be better to address the limitation discussed in bugs 851354 and 790492, where jobs can't currently be scheduled against systems in Manual or Broken mode.

The proposed solution there is to provide a way for a user to tell the scheduler "I know this system isn't in Automated mode, schedule the job anyway". However, it would only handle cases where the user is requesting a specific machine, not ones where they're submitting requests like "give me any x86_64 machine with 2+ TB of local storage".
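
To make the idea concrete, here is a minimal sketch of such a scheduler-side eligibility check. It only illustrates the proposal as described in this comment: `Candidate`, `eligible_systems`, and the `force_fqdn` parameter are hypothetical names, and nothing here reflects how Beaker actually implements scheduling.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Candidate:
    # Hypothetical, simplified view of a system as seen by the scheduler.
    fqdn: str
    mode: str  # "Automated", "Manual" or "Broken"


def eligible_systems(systems: List[Candidate],
                     force_fqdn: Optional[str] = None) -> List[Candidate]:
    """Pick the systems a recipe may be scheduled on.

    force_fqdn is a hypothetical "schedule the job anyway" override that
    names one specific system; when given, that system is eligible
    regardless of its mode.
    """
    if force_fqdn is not None:
        return [s for s in systems if s.fqdn == force_fqdn]
    # Generic hardware requests only ever see Automated systems.
    return [s for s in systems if s.mode == "Automated"]
```

The override is keyed on a single host name on purpose: a generic request such as "any x86_64 machine" never reaches Manual or Broken systems in this sketch.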

If it fits the use case, I'm thinking that may actually be a more appropriate resolution to this issue - then the affected systems could be switched to Manual mode, and the web UI workflow would revert to the old one (where Take is available by default).

Comment 8 Nick Coghlan 2014-02-11 00:09:45 UTC
With regards to your docs questions, the documentation for the system details page provides the only current description of the difference between Automated/Manual systems:

   http://beaker-project.org/docs/user-guide/systems.html#system-details

Essentially, it is expected that Automated systems will be managed primarily through the scheduler, including using the Reserve Workflow to request reservation of a system through the scheduler rather than directly. Unfortunately, the Reserve Workflow is not currently documented, but http://beaker-project.org/dev/proposals/system-page-improvements.html#related-improvements-to-the-reserve-workflow should give the general idea.

That workflow is accessible through the Scheduler->Reserve menu option, and steps the user through the process of choosing the distro tree they want to install, then choosing a system (or letting Beaker choose one automatically) and deciding how long they wish to reserve the system for.

Direct provisioning through the System Details page is then intended primarily for systems in Manual mode, rather than those in Automated mode. However, the way that is currently done potentially creates problems if the system *is* in Automated mode. Beaker 0.15 tries to avoid the worst of those problems, but also highlights how confusing the current behaviour is.

That's one of the key motivations for the system page redesign proposal - by integrating it properly with the Reserve Workflow, it allows the Provision tab on the system details page to *always* be about immediate provisioning, while scheduled provisioning can be handled with an appropriate cross link to the updated Reserve Workflow page.

Comment 9 Dan Callaghan 2014-02-11 02:50:19 UTC
(In reply to Jiri Hladky from comment #6)
> The Take button is extremely useful when dealing with installation problems.
> Often, it's just a silly bug in Anaconda which requires some user input that
> can be provided via the serial console. However, you need the Take button for
> that. When you schedule the provision, it takes time until the job is started
> and you cannot really follow the serial console all the time. Even worse,
> when the installation times out, the system is gone. You really need the Take
> button for this.

I'm not sure I understand how you were using the Take button here, since it's not possible to Take a system while the scheduled job is still running...

If you mean, you are running a normal scheduled job but you need an opportunity to log on to the system and fix things up if the installation goes wrong: we have an open RFE which covers that use case, bug 639938, which we are hoping to get to soon.

If you mean that you want to run through the Anaconda prompts by hand, instead of having Beaker provide a complete kickstart which does an unattended installation, you can do that now: put "manual" in ks_meta, and the kickstart will omit most of the predefined configuration so that Anaconda prompts you. Of course you still have to babysit the installation to make sure it completes before the watchdog expires.
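
As a sketch of what that could look like, the following Python snippet builds a small job definition with "manual" in ks_meta and a host requirement naming one system. The element and attribute names follow the Beaker job XML format as far as I recall it and should be checked against the job XML documentation; the function name, host name, and distro name are placeholders invented for this example.

```python
import xml.etree.ElementTree as ET


def manual_install_job(fqdn: str, distro_name: str) -> str:
    """Build job XML that asks for a hand-driven Anaconda install on one host."""
    job = ET.Element("job")
    ET.SubElement(job, "whiteboard").text = "Manual Anaconda install on " + fqdn
    recipe_set = ET.SubElement(job, "recipeSet")
    # ks_meta="manual" is the option mentioned above; the rest is assumed structure.
    recipe = ET.SubElement(recipe_set, "recipe", ks_meta="manual")
    distro = ET.SubElement(recipe, "distroRequires")
    ET.SubElement(distro, "distro_name", op="=", value=distro_name)
    host = ET.SubElement(recipe, "hostRequires")
    ET.SubElement(host, "hostname", op="=", value=fqdn)
    ET.SubElement(recipe, "task", name="/distribution/install", role="STANDALONE")
    return ET.tostring(job, encoding="unicode")


if __name__ == "__main__":
    # Placeholder values; substitute a real system and distro.
    print(manual_install_job("host.example.com", "RHEL-6.5"))
```

The printed XML could be saved to a file and passed to bkr job-submit <xml_file>.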

Or if you really don't want to run scheduled jobs on the system at all, you just want to reserve it by hand, then set it to Manual instead of Automated.

Comment 10 Jiri Hladky 2014-02-11 09:05:53 UTC
Hi,

please let me clarify. Until now, my usual workflow was:

Find a specific system (by name) - Take it - Provision RHEL, run tests, install various new kernels, run tests - Return it

I work in performance QE. I need to work on the same systems to ensure that I compare apples to apples.

I use systems from the kernel-hw-qe Beaker group. These are brand new systems and there are often problems with installation. Using the workflow with the Take button is better than using the Schedule Provision workflow. With Schedule Provision I see the following problems:

- Long delay. Even when the system is free, it takes tens of minutes before the provision starts
- When the installation goes wrong, Beaker kills the job and possibly gives the system to a different user

I don't have rights to:
- Loan the system to myself
- Change the system status from Automated to Manual

If I understand it correctly, the proposed solution is:
1) Change the status of the system to Manual
2) The Take button will be activated and can be used to take the system
3) Return the system
4) Change the status back to Automated

Well, this seems overcomplicated to me compared to the old state, where the Take button was available regardless of the status of the system.

Questions regarding the proposed solution:
1) Will an ordinary user be allowed to temporarily switch the status of the system to Manual? Currently I don't have the rights to do it.
2) Would it be possible to simplify the proposed flow so that the transition from Automated to Manual mode is handled fully automatically by Beaker itself? If a system is free, there would be a Take button, as there used to be, regardless of the status of the system, and this button would
a) change the state of the system to Manual
b) take the system
Similarly, the Return button would automatically perform two actions:
a) return the system
b) change the status back from Manual to the original state
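
The behaviour I am asking for in question 2 could be sketched roughly like this. It is only an illustration of the request, not existing Beaker code; `PooledSystem`, `take`, and `give_back` are names made up for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PooledSystem:
    # Hypothetical, simplified model of a system in the pool.
    mode: str = "Automated"
    user: Optional[str] = None
    saved_mode: Optional[str] = None  # mode to restore when the system is returned

    def take(self, user: str) -> None:
        """Take a free system: remember its mode, switch to Manual, reserve it."""
        if self.user is not None:
            raise RuntimeError("system is not free")
        self.saved_mode = self.mode
        self.mode = "Manual"
        self.user = user

    def give_back(self, user: str) -> None:
        """Return the system and restore whatever mode it had before take()."""
        if self.user != user:
            raise RuntimeError("system is not held by this user")
        self.user = None
        self.mode = self.saved_mode
        self.saved_mode = None
```

Because take() refuses a system that already has a user, a novice could not interrupt a running job this way, and the owner's original mode setting comes back on return.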

Thanks
Jirka

Comment 11 Nick Coghlan 2014-02-11 10:13:34 UTC
In general, Beaker currently expects that systems that users are likely to want to reserve manually will be left relatively permanently in Manual mode, while those that they want to use through the scheduler will be left in Automated mode. If the systems of interest are not considered to be reliable, then it sounds like they *shouldn't* be made available through the scheduler, and that Manual mode may actually be more appropriate (in which case the Take button will appear by default, as it did in the past).

Manual reservation of Automated systems through the web UI was something that coincidentally worked, not a behaviour that was deliberately designed into the system. This is why it has never been supported through the CLI, and why in Beaker 0.15.0 it wasn't supported at all. It is only in removing it that we have discovered that some users saw it as a feature rather than a bug (which is why Beaker 0.15.1 added back the capability as long as a system loan was in place).

The intent (given the change in Beaker 0.15.1) is that the "loan-self" permission in the Beaker 0.15 ACLs should cover this case. At the moment, this approach is problematic due to the awkwardness of the current "loan to self" interaction that Prarit described in the original issue report. Unfortunately, the issues with that are so embedded in the current architecture of the page that we needed to completely rebuild it to make it work in a more streamlined way. We're now at a point where we can't just deploy the fixed page: we need to ensure that current users get a chance to review the new system page design and provide feedback before we incorporate it into a release. Under the system page redesign proposal, the workflow is that a user can just click "Borrow" and then click "Take" (these will appear in the "Quick Info" boxes at the top of the screen).

As a nearer term fix, we will also aim to ensure that the scheduler in Beaker 0.16 permits scheduling of tasks against systems in Manual mode - that should make it more tolerable for systems that need to support this workflow to be left permanently in Manual mode.

Comment 12 Jiri Hladky 2014-02-11 10:53:17 UTC
Hi Nick,

thanks for the clarification. I'm fine with the "Loan to self"/"Borrow" approach as long as it does not require special rights. Could you please confirm this?

Thanks
Jirka

Comment 13 Prarit Bhargava 2014-02-11 12:34:56 UTC
(In reply to Nick Coghlan from comment #11)

> 
> The intent (given the change in Beaker 0.15.1) is that the "loan-self"
> permission in the Beaker 0.15 ACLs should cover this case. At the moment,
> this approach is problematic due to the awkwardness of the current "loan to
> self" interaction that Prarit described in the original issue report -
> unfortunately, the issues with that are so embedded in the current
> architecture of the page that we needed to completely rebuild it to make it
> work in a more streamlined way, and we're now at a point where we can't just
> deploy the fixed page, we need to ensure that current users get a chance to
> review the new system page design and provide feedback before we incorporate
> it into a release. The workflow given the system page redesign proposal is
> that a user is able to just click "Borrow" and then click "Take" (and these
> will appear in the "Quick Info" boxes at the top of the screen).
> 

Nick, when and how does this review happen?

P.

Comment 14 Nick Coghlan 2014-02-11 23:42:15 UTC
Jiri: there is a separate "loan-to-self" permission in Beaker 0.15+ that is needed for the Automated Borrow->Take workflow, or else the system needs to be set to Manual. I'll add a new section to the "Migrating to Beaker 0.15" section of the release notes about handling workflows that involved manual reservations of automated systems.

Prarit: the upstream design proposal is already up at http://beaker-project.org/dev/proposals/system-page-improvements.html and requests for comment sent to the upstream development list and the Red Hat internal user list. We'll aim to get a public demo instance set up some time in the next few weeks (in time to gather feedback before 0.17 is released).

Comment 15 Jiri Hladky 2014-02-12 09:25:49 UTC
Hi Nick,

thanks for the update. It sounds reasonable. I will wait for the new version which supports Borrow->Take workflow.

Thanks!
Jirka

Comment 16 Prarit Bhargava 2014-02-12 12:58:31 UTC
(In reply to Nick Coghlan from comment #14)
> Prarit: the upstream design proposal is already up at
> http://beaker-project.org/dev/proposals/system-page-improvements.html and
> requests for comment sent to the upstream development list and the Red Hat
> internal user list. We'll aim to get a public demo instance set up some time
> in the next few weeks (in time to gather feedback before 0.17 is released).

Thanks Nick -- is there a mailing list that I need to subscribe to in order to get updates about the public demo?

P.

Comment 17 Nick Coghlan 2014-02-12 23:05:47 UTC
When the demo is live, I'll add a comment here, and also announce it on both the upstream list at https://lists.fedorahosted.org/mailman/listinfo/beaker-devel and the main internal Red Hat Beaker user list.

Comment 21 Dan Callaghan 2014-11-25 07:18:10 UTC
Beaker 19.0 has been released.

