Description of problem:
Sometimes importing a distribution into Beaker takes longer than desirable. For greater flexibility, it would be nice to be able to schedule tests for distributions which have not been imported. For example, libvirt contains some code which is able to do this. It uses the .treeinfo file which can be found in Fedora and RHEL operating systems. This file contains all the important information needed to successfully start an installation.

The submission could look similar to this:

<distroRequires>
    <url url="http://example.com/path/to/distribution-tree/"/>
</distroRequires>
It may also require some sort of generic family which would allow "stranger", unidentified distributions to be submitted.
This request covers distributions which are not handled by the automatic import: for example, distribution trees which are built in unusual places, or new families which have yet to be covered.
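To illustrate the .treeinfo idea from the description, here is a minimal sketch of reading the boot metadata out of such a file. It assumes the usual INI layout with a [general] section and an [images-<arch>] section; the sample content, the read_treeinfo() helper and the returned keys are hypothetical and are not Beaker or libvirt code. It is written for Python 3's configparser.

```python
import configparser

# Hypothetical sample; real .treeinfo files carry more sections and keys.
SAMPLE_TREEINFO = """\
[general]
family = Fedora
version = 27
arch = x86_64

[images-x86_64]
kernel = images/pxeboot/vmlinuz
initrd = images/pxeboot/initrd.img
"""


def read_treeinfo(text, base_url):
    """Return the metadata needed to boot an installer from an unimported tree."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    general = parser["general"]
    arch = general["arch"]
    # Kernel and initrd paths are relative to the root of the tree.
    images = parser["images-%s" % arch]
    base = base_url.rstrip("/") + "/"
    return {
        "family": general["family"],
        "version": general["version"],
        "arch": arch,
        "kernel_url": base + images["kernel"],
        "initrd_url": base + images["initrd"],
    }


info = read_treeinfo(SAMPLE_TREEINFO, "http://example.com/path/to/distribution-tree/")
```

In a real deployment the text would be fetched from <tree url>/.treeinfo rather than taken from a string.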
Bug 1032881 covers the fact that the tangentially related feature that allows "naked distro imports" isn't documented properly (it is only described in the help text for distro-import). There's definitely merit to the idea of being able to provision experimental builds, although it would be desirable to offer that in a way that also works for manual system provisioning.
Bringing over a suggestion from Bill on bug 1124474. Bill's suggestion was for a new element that would be mutually exclusive with the existing distroRequires:

<distro name="My-Custom-Distro" arch="x86_64">
    <family>RedHatEnterpriseLinux6</family>
    <kernel>http://fqdn/path/to/kernel</kernel>
    <initrd>http://fqdn/path/to/initrd</initrd>
    <repo>http://fqdn/path/to/distro</repo>
</distro>

One thing we would need to be careful with is managing uniqueness constraints - it would be incredibly annoying if a distro import failed because someone had previously run a trial job using those details. An alternative would be to allow recipes to be run without distro data, rather than implicitly creating a distro.

If the links aren't to a CDN, then it would be advisable to lock the recipe to an appropriate lab controller - that would need to be mentioned in the associated docs.
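The "mutually exclusive" requirement can be sketched as a small check on the recipe XML. This is only an illustration under assumptions: the distro_source() helper and the bare <recipe> wrapper are hypothetical, and a real implementation would more likely express the constraint in the job schema.

```python
import xml.etree.ElementTree as ET


def distro_source(recipe_xml):
    """Return 'distroRequires' or 'distro'; reject recipes with both or neither."""
    recipe = ET.fromstring(recipe_xml.strip())
    present = [tag for tag in ("distroRequires", "distro")
               if recipe.find(tag) is not None]
    if len(present) != 1:
        raise ValueError(
            "recipe needs exactly one of <distroRequires/> or <distro/>, found %r"
            % present)
    return present[0]


# Recipe using the element suggested above.
custom = """
<recipe>
  <distro name="My-Custom-Distro" arch="x86_64">
    <family>RedHatEnterpriseLinux6</family>
    <kernel>http://fqdn/path/to/kernel</kernel>
    <initrd>http://fqdn/path/to/initrd</initrd>
    <repo>http://fqdn/path/to/distro</repo>
  </distro>
</recipe>
"""

# Recipe that wrongly supplies both elements.
both = '<recipe><distroRequires/><distro name="x" arch="x86_64"/></recipe>'
```

Calling distro_source(custom) returns "distro", while distro_source(both) raises ValueError.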
*** Bug 1124474 has been marked as a duplicate of this bug. ***
Hmm, it just occurred to me that this is also relevant to our long term goal of separating out the distro library as an independent service - at that point, we're likely to want to be passing these paths around more explicitly in the scheduler anyway.
Beaker 19 will have the ability to run the test harness in a container (Bug:1131388) (https://beaker-project.org/docs-develop/whats-new/next/test-harness-in-container.html). This effectively means that tests can be run on any user-specified distro without it first being imported into Beaker. However, you must have a Docker registry which holds that image. Of course, the kinds of tests that can be run will be limited, I am sure.
Anwesha is going to take a stab at writing up a design proposal for this feature.
I have posted a proposal for this RFE here: https://gerrit.beaker-project.org/#/c/5793/
Since that patch is just for the design proposal, let's leave this at ASSIGNED until there is an implementation ready for review.
As the denormalised distro metadata (which is available at the time of job submission) needs to be stored in the installation table, and the installation row is only created at the time of provisioning, I am currently exploring whether I can change it to be created at the time of job submission instead. For this, there are several bits of information that need to be decoupled from recipe.distro_tree - kernel_options being one of them. Looking into how to handle this right now.
https://gerrit.beaker-project.org/#/c/5795/ This is the very first part of the patch, which involves creating the installation row at submission time rather than provisioning time.
https://gerrit.beaker-project.org/#/c/5839/
This is the second part of the patch, which involves:
1. Updating the beaker-job Relax NG schema to allow definition of user-defined distro metadata
2. Adding columns to the installation table for the above metadata
3. Making the distro_tree_id column nullable in installation due to the new available method for defining distros
4. UI updates in the recipe and installation page for the case of user-defined distros (as the distro is not registered, the distro information page will not be available; distro data is displayed in the installation tab instead)
The proposal for this RFE can be viewed here: https://beaker-project.org/dev/proposals/allow-installation-user-defined-distro.html
https://gerrit.beaker-project.org/#/c/5860
This is the final part of the patch. In this patch:
1. Kickstart template rendering for user-defined distros is customized according to the provided metadata
2. The lab controller code is updated to use the user-provided tree, kernel and initrd URLs instead of registered distro trees, to handle the user-defined distro scenario
Due to issues with debugging the final part of the patch (due to the large size) I have been working on dividing it into smaller fragments. So far I have posted:
* Refactoring compatible_with_distro_tree method to use denormalised distro metadata
  https://gerrit.beaker-project.org/#/c/5920/
* Adding and populating the distro metadata columns to installation table
  https://gerrit.beaker-project.org/#/c/5921/
Additional sub-patches:
* Making the scheduler use installation table parameters instead of distro_tree parameters
  https://gerrit.beaker-project.org/#/c/5923/
* Making kickstart templates use installation columns where possible
  https://gerrit.beaker-project.org/#/c/5926/
* Making beaker-provision use the installation columns
  https://gerrit.beaker-project.org/#/c/5927/
* Adding <distro/> tag option for specifying distro metadata in relax ng schema
  https://gerrit.beaker-project.org/#/c/5929/
* Enabling the feature for user-defined distro handling
  https://gerrit.beaker-project.org/#/c/5930/
(In reply to Don Zickus from comment #19) Anwesha has been making good progress on the patch series for this feature, but the biggest issue we are currently facing is with modifying one of the central, very complex scheduler queries (in the schedule_queued_recipes() routine) to work properly when recipe.distro_tree_id is NULL. I've filed a separate bug about this problematic query and how we can simplify or eliminate it: bug 1519589.
My mistake, 25.0 is the next release.
There are a couple of edge cases where this breaks, I just noticed on beaker-devel. They are both cases where recipe.installation is None, and the code is not expecting that to be true.

There are two different reasons why recipe.installation can be None:

* on an old recipe, which was Cancelled or Aborted before it was scheduled and thus recipe.provision() was never called and thus recipe.installation was never created
* on a recipe which was queued at the time of the upgrade, thus recipe.provision() had not been called yet

There are at least two places where the new code is assuming that recipe.installation is not None. This is what beakerd is currently spewing in a loop, for the recipes which were queued before the upgrade:

bkr.server.tools.beakerd DEBUG Checking for queued recipes which are runnable on ibm-x3250m4-18.rhts.eng.bos.redhat.com
bkr.server.tools.beakerd ERROR Error in schedule_pending_system(40)
Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/bkr/server/tools/beakerd.py", line 387, in schedule_pending_systems
    schedule_pending_system(system_id)
  File "/usr/lib/python2.6/site-packages/bkr/server/tools/beakerd.py", line 419, in schedule_pending_system
    if not recipe.candidate_systems().filter(System.id == system.id).first():
  File "/usr/lib/python2.6/site-packages/bkr/server/model/scheduler.py", line 3130, in candidate_systems
    systems = systems.filter(System.compatible_with_distro_tree(arch=self.installation.arch,
AttributeError: 'NoneType' object has no attribute 'arch'

And if you open the recipe page, you get a JS error "TypeError: installation is null" while rendering a JST template (not sure which one but it should be easy enough to find).
(In reply to Dan Callaghan from comment #23) I think the best solution might be to adjust the database migration so it creates the missing installation row for recipes which are queued at the time of migration (not sure how hard this will be?). That should fix the crash in schedule_pending_system(). And it means we won't need to carry a special code path in the scheduler just for those queued recipes, since there will only be a small finite number of them and then never again. However for the web UI we *should* make it handle the case where recipe.installation is missing. I don't think it's practical to go back and fill in an installation row for every recipe in the database back to the start.
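The migration idea can be sketched as a single INSERT ... SELECT that only touches queued recipes which lack an installation row. This is a rough illustration using an in-memory SQLite database with a heavily simplified, hypothetical schema; the real Beaker tables have many more columns and the real migration runs against MySQL, so only the shape of the query is meant to carry over.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified stand-ins for the real recipe and installation tables.
conn.executescript("""
CREATE TABLE recipe (id INTEGER PRIMARY KEY, status TEXT, distro_tree_id INTEGER);
CREATE TABLE installation (
    id INTEGER PRIMARY KEY,
    recipe_id INTEGER,
    distro_tree_id INTEGER
);
INSERT INTO recipe VALUES (1, 'Queued', 10);     -- queued before the upgrade
INSERT INTO recipe VALUES (2, 'Cancelled', 11);  -- old recipe, never provisioned
INSERT INTO recipe VALUES (3, 'Queued', 12);     -- already has an installation
INSERT INTO installation (recipe_id, distro_tree_id) VALUES (3, 12);
""")


def backfill_queued_installations(conn):
    """Create installation rows only for queued recipes that lack one."""
    conn.execute("""
        INSERT INTO installation (recipe_id, distro_tree_id)
        SELECT recipe.id, recipe.distro_tree_id
        FROM recipe
        LEFT JOIN installation ON installation.recipe_id = recipe.id
        WHERE recipe.status = 'Queued' AND installation.id IS NULL
    """)
    conn.commit()


backfill_queued_installations(conn)
rows = conn.execute(
    "SELECT recipe_id FROM installation ORDER BY recipe_id").fetchall()
```

After the backfill, recipes 1 and 3 have an installation row, while the old Cancelled recipe 2 is left alone and still has to be handled by the web UI.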
https://gerrit.beaker-project.org/#/c/5969/ This CR handles the web UI side of things
Back to ASSIGNED while we fix up some of the issues uncovered.
This one is the database migration fix for recipes that are queued at migration time: https://gerrit.beaker-project.org/#/c/5970/
https://gerrit.beaker-project.org/#/c/5971/ This change is for the issues with the old UI
beakerd on beaker-devel hit this exception, running commit 7f50fedf9:

bkr.server.tools.beakerd ERROR Error in update_dirty_job(13711)
Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/bkr/server/tools/beakerd.py", line 98, in update_dirty_jobs
    update_dirty_job(job_id)
  File "/usr/lib/python2.6/site-packages/bkr/server/tools/beakerd.py", line 113, in update_dirty_job
    job.update_status()
  File "/usr/lib/python2.6/site-packages/bkr/server/model/scheduler.py", line 1259, in update_status
    self._update_status()
  File "/usr/lib/python2.6/site-packages/bkr/server/model/scheduler.py", line 1298, in _update_status
    mail.job_notify(self)
  File "/usr/lib/python2.6/site-packages/bkr/server/mail.py", line 68, in job_notify
    body=failed_recipes(job),
  File "/usr/lib/python2.6/site-packages/bkr/server/mail.py", line 44, in failed_recipes
    arch = recipe.installation.arch.arch if recipe.installation else recipe.distro_tree.arch.arch \
AttributeError: 'NoneType' object has no attribute 'arch'

The job in question is J:13711 and its recipe is R:20611. It was a RHEL7 dogfood recipe running on an OpenStack instance.
(In reply to Dan Callaghan from comment #30)
Indeed, that recipe has an installation row but the new columns are all NULL:

mysql> select * from installation where recipe_id = 20611 \G
*************************** 1. row ***************************
                   id: 23528
       distro_tree_id: 23463
       kernel_options: console=tty0 console=ttyS0,115200n8 ks=http://beaker-devel.app.eng.bos.redhat.com/kickstart/17619 ksdevice=bootif netbootloader=pxelinux.0 serial
rendered_kickstart_id: 17619
              created: 2018-01-12 03:13:03
             rebooted: 2018-01-12 03:14:33
      install_started: 2018-01-12 03:15:15
     install_finished: 2018-01-12 03:24:06
 postinstall_finished: 2018-01-12 03:25:03
            system_id: NULL
            recipe_id: 20611
             tree_url: NULL
          initrd_path: NULL
          kernel_path: NULL
          distro_name: NULL
              osmajor: NULL
              osminor: NULL
              variant: NULL
              arch_id: NULL
I fixed it on beaker-devel like this:

--- a/Server/bkr/server/mail.py
+++ b/Server/bkr/server/mail.py
@@ -41,7 +41,7 @@ def failed_recipes(job):
         if recipe.is_failed():
             distro_name = recipe.installation.distro_name if recipe.installation else \
                 recipe.distro_tree.distro.name if recipe.distro_tree else "Unknown"
-            arch = recipe.installation.arch.arch if recipe.installation else recipe.distro_tree.arch.arch \
+            arch = recipe.installation.arch.arch if recipe.installation.arch else recipe.distro_tree.arch.arch \
                 if recipe.distro_tree else "Unknown"
             msg = "%s\t\tRecipeID: %s Arch: %s System: %s Distro: %s Status: %s Result: %s <%s>\n" \
                 % (msg, recipe.id, arch, recipe.resource, distro_name, recipe.status, recipe.result,

Seems like we need a test case which specifically covers a recipe which was started *before* the upgrade and then finishes *after* the upgrade, to ensure that beakerd can do update_dirty_jobs() on it successfully.

However, I am still confused about why R:20611 appears like a recipe which was started *before* this patch, when in fact it was submitted to beaker-devel at a time when beaker-devel was already upgraded to commit dedf5cc29. Unless we had some old mod_wsgi workers left behind running the old code but still accepting HTTP requests, or something bad like that...
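The one-line change above still dereferences recipe.installation, so it would fail again for a recipe with no installation row at all. As a sketch of a fallback that covers every combination seen in this bug, something like the following could work. The describe_recipe() helper and the SimpleNamespace stand-ins are hypothetical and are not the change that was eventually merged.

```python
from types import SimpleNamespace


def describe_recipe(recipe):
    """Return (distro_name, arch), preferring installation data, then distro_tree."""
    installation = getattr(recipe, "installation", None)
    distro_tree = getattr(recipe, "distro_tree", None)
    # New-style rows carry denormalised metadata on the installation itself.
    if installation is not None and installation.arch is not None:
        return installation.distro_name, installation.arch.arch
    # Rows created before the upgrade have NULL metadata, so use the distro tree.
    if distro_tree is not None:
        return distro_tree.distro.name, distro_tree.arch.arch
    return "Unknown", "Unknown"


# Stand-in objects for illustration only; the distro names are made up.
x86_64 = SimpleNamespace(arch="x86_64")
tree = SimpleNamespace(distro=SimpleNamespace(name="Imported-Distro"), arch=x86_64)

old_row = SimpleNamespace(
    installation=SimpleNamespace(distro_name=None, arch=None), distro_tree=tree)
new_row = SimpleNamespace(
    installation=SimpleNamespace(distro_name="My-Custom-Distro", arch=x86_64),
    distro_tree=None)
never_provisioned = SimpleNamespace(installation=None, distro_tree=tree)
nothing = SimpleNamespace(installation=None, distro_tree=None)
```

The old_row case corresponds to R:20611, where the installation exists but its new columns are all NULL.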
Some other problems I noticed...

Some dogfood jobs from 10 Jan and 12 Jan (including the one above) have ended up with some incorrect attributes in their results.xml. That's the Beaker results XML produced by bkr job-results. Specifically:

https://beaker-project.org/jenkins-results/beaker-review-checks-dogfood-RedHatEnterpriseLinux7/1552/beaker/J:13665/results.xml

<recipe arch="None" distro="None" duration="3:22:04" family="None"
        finish_time="2018-01-10 06:34:17" id="20565" job_id="13665"
        kernel_options="" kernel_options_post=""
        kickstart_url="http://CENSORED.redhat.com/kickstart/17575"
        ks_meta="method=http selinux=--disabled" recipe_set_id="17901"
        result="Pass" role="None" start_time="2018-01-10 03:12:13"
        status="Completed" system="host-192-168-11-13.openstacklocal"
        variant="None" whiteboard="Nose Tests">

The arch, distro, variant, and family attributes should all be filled in with proper values for RHEL7.

It occurred to me to also check the results XML for older, archived jobs. That actually crashes completely with a 500.
The traceback for that is:

bkr.server ERROR Exception on /jobs/12340.xml [GET]
Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/flask/app.py", line 1817, in wsgi_app
    response = self.full_dispatch_request()
  File "/usr/lib/python2.6/site-packages/flask/app.py", line 1477, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/usr/lib/python2.6/site-packages/flask/app.py", line 1381, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "/usr/lib/python2.6/site-packages/flask/app.py", line 1475, in full_dispatch_request
    rv = self.dispatch_request()
  File "/usr/lib/python2.6/site-packages/flask/app.py", line 1461, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/usr/lib/python2.6/site-packages/bkr/server/jobs.py", line 1099, in job_xml
    job.to_xml(clone=False, include_logs=include_logs),
  File "/usr/lib/python2.6/site-packages/bkr/server/model/scheduler.py", line 1200, in to_xml
    job.append(rs.to_xml(clone=clone, include_enclosing_job=False, **kwargs))
  File "/usr/lib/python2.6/site-packages/bkr/server/model/scheduler.py", line 1692, in to_xml
    recipeSet.append(r.to_xml(clone, include_enclosing_job=False, **kwargs))
  File "/usr/lib/python2.6/site-packages/bkr/server/model/scheduler.py", line 3042, in to_xml
    for guest in self.guests]
  File "/usr/lib/python2.6/site-packages/bkr/server/model/scheduler.py", line 2973, in to_xml
    method = self.reduced_install_options().ks_meta.get('method', None)
  File "/usr/lib/python2.6/site-packages/bkr/server/model/scheduler.py", line 2530, in reduced_install_options
    self.installation.arch))
  File "/usr/lib/python2.6/site-packages/bkr/server/model/distrolibrary.py", line 171, in install_options_for_distro
    osmajor_name, osminor, variant, arch))
  File "/usr/lib/python2.6/site-packages/bkr/server/model/distrolibrary.py", line 43, in default_install_options_for_distro
    name, version = split_osmajor_name_version(osmajor_name)
  File "/usr/lib/python2.6/site-packages/bkr/server/model/distrolibrary.py", line 33, in split_osmajor_name_version
    return re.match(r'(.*?)(rawhide|\d*)$', osmajor).groups()
  File "/usr/lib64/python2.6/re.py", line 137, in match
    return _compile(pattern, flags).match(string)
TypeError: expected string or buffer

That traceback reveals a few problems... Obviously it shouldn't crash at all. But also, why is it recomputing reduced_install_options() for this recipe which is already long completed?
https://gerrit.beaker-project.org/#/c/5972/ --> beaker mail issue https://gerrit.beaker-project.org/#/c/5973/ --> results XML issue
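For reference, the TypeError in the results XML traceback comes from passing None to re.match(). The sketch below reuses the regular expression shown in that traceback and adds a guard for a missing osmajor. The guard and its (None, None) return value are an assumption for illustration, not necessarily what the changes linked above do.

```python
import re


def split_osmajor_name_version(osmajor):
    """Split e.g. 'RedHatEnterpriseLinux7' into ('RedHatEnterpriseLinux', '7')."""
    if osmajor is None:
        # Hypothetical guard: installation rows created before the upgrade
        # have no denormalised osmajor, so there is nothing to split.
        return None, None
    # Same pattern as in the traceback: lazy name, then 'rawhide' or digits.
    return re.match(r'(.*?)(rawhide|\d*)$', osmajor).groups()


rhel = split_osmajor_name_version('RedHatEnterpriseLinux7')
rawhide = split_osmajor_name_version('Fedorarawhide')
missing = split_osmajor_name_version(None)
```

Callers would still need to decide what to do with a missing name, for example falling back to recipe.distro_tree for old recipes.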
https://gerrit.beaker-project.org/#/c/5980/
While cloning Jan's job in comment 26 I've found another problem. Traceback is:

Feb 15 02:33:18 beaker-devel.app.eng.bos.redhat.com beaker-server[25526]: Traceback (most recent call last):
Feb 15 02:33:18 beaker-devel.app.eng.bos.redhat.com beaker-server[25526]:   File "/usr/lib/python2.6/site-packages/CherryPy-2.3.0-py2.6.egg/cherrypy/_cphttptools.py", line 121, in _run
Feb 15 02:33:18 beaker-devel.app.eng.bos.redhat.com beaker-server[25526]:     self.main()
Feb 15 02:33:18 beaker-devel.app.eng.bos.redhat.com beaker-server[25526]:   File "/usr/lib/python2.6/site-packages/CherryPy-2.3.0-py2.6.egg/cherrypy/_cphttptools.py", line 264, in main
Feb 15 02:33:18 beaker-devel.app.eng.bos.redhat.com beaker-server[25526]:     body = page_handler(*virtual_path, **self.params)
Feb 15 02:33:18 beaker-devel.app.eng.bos.redhat.com beaker-server[25526]:   File "<string>", line 2, in clone
Feb 15 02:33:18 beaker-devel.app.eng.bos.redhat.com beaker-server[25526]:   File "/usr/lib/python2.6/site-packages/bkr/server/identity.py", line 288, in require
Feb 15 02:33:18 beaker-devel.app.eng.bos.redhat.com beaker-server[25526]:     return func(*args, **kwargs)
Feb 15 02:33:18 beaker-devel.app.eng.bos.redhat.com beaker-server[25526]:   File "<string>", line 3, in clone
Feb 15 02:33:18 beaker-devel.app.eng.bos.redhat.com beaker-server[25526]:   File "/usr/lib/python2.6/site-packages/turbogears/controllers.py", line 361, in expose
Feb 15 02:33:18 beaker-devel.app.eng.bos.redhat.com beaker-server[25526]:     *args, **kw)
Feb 15 02:33:18 beaker-devel.app.eng.bos.redhat.com beaker-server[25526]:   File "/usr/lib/python2.6/site-packages/bkr/server/wsgi.py", line 91, in run_with_transaction_noop
Feb 15 02:33:18 beaker-devel.app.eng.bos.redhat.com beaker-server[25526]:     return func(*args, **kwargs)
Feb 15 02:33:18 beaker-devel.app.eng.bos.redhat.com beaker-server[25526]:   File "<generated code>", line 0, in _expose
Feb 15 02:33:18 beaker-devel.app.eng.bos.redhat.com beaker-server[25526]:   File "/usr/lib/python2.6/site-packages/peak/rules/core.py", line 153, in __call__
Feb 15 02:33:18 beaker-devel.app.eng.bos.redhat.com beaker-server[25526]:     return self.body(*args, **kw)
Feb 15 02:33:18 beaker-devel.app.eng.bos.redhat.com beaker-server[25526]:   File "/usr/lib/python2.6/site-packages/turbogears/controllers.py", line 390, in <lambda>
Feb 15 02:33:18 beaker-devel.app.eng.bos.redhat.com beaker-server[25526]:     fragment, options, args, kw)))
Feb 15 02:33:18 beaker-devel.app.eng.bos.redhat.com beaker-server[25526]:   File "/usr/lib/python2.6/site-packages/turbogears/controllers.py", line 425, in _execute_func
Feb 15 02:33:18 beaker-devel.app.eng.bos.redhat.com beaker-server[25526]:     output = errorhandling.try_call(func, *args, **kw)
Feb 15 02:33:18 beaker-devel.app.eng.bos.redhat.com beaker-server[25526]:   File "/usr/lib/python2.6/site-packages/turbogears/errorhandling.py", line 77, in try_call
Feb 15 02:33:18 beaker-devel.app.eng.bos.redhat.com beaker-server[25526]:     return func(self, *args, **kw)
Feb 15 02:33:18 beaker-devel.app.eng.bos.redhat.com beaker-server[25526]:   File "<string>", line 3, in clone
Feb 15 02:33:18 beaker-devel.app.eng.bos.redhat.com beaker-server[25526]:   File "/usr/lib/python2.6/site-packages/turbogears/controllers.py", line 207, in validate
Feb 15 02:33:18 beaker-devel.app.eng.bos.redhat.com beaker-server[25526]:     return errorhandling.run_with_errors(errors, func, *args, **kw)
Feb 15 02:33:18 beaker-devel.app.eng.bos.redhat.com beaker-server[25526]:   File "/usr/lib/python2.6/site-packages/turbogears/errorhandling.py", line 118, in run_with_errors
Feb 15 02:33:18 beaker-devel.app.eng.bos.redhat.com beaker-server[25526]:     return func(self, *args, **kw)
Feb 15 02:33:18 beaker-devel.app.eng.bos.redhat.com beaker-server[25526]:   File "/usr/lib/python2.6/site-packages/bkr/server/jobs.py", line 381, in clone
Feb 15 02:33:18 beaker-devel.app.eng.bos.redhat.com beaker-server[25526]:     textxml = lxml.etree.tostring(job.to_xml(clone=True),
Feb 15 02:33:18 beaker-devel.app.eng.bos.redhat.com beaker-server[25526]:   File "/usr/lib/python2.6/site-packages/bkr/server/model/scheduler.py", line 1200, in to_xml
Feb 15 02:33:18 beaker-devel.app.eng.bos.redhat.com beaker-server[25526]:     job.append(rs.to_xml(clone=clone, include_enclosing_job=False, **kwargs))
Feb 15 02:33:18 beaker-devel.app.eng.bos.redhat.com beaker-server[25526]:   File "/usr/lib/python2.6/site-packages/bkr/server/model/scheduler.py", line 1692, in to_xml
Feb 15 02:33:18 beaker-devel.app.eng.bos.redhat.com beaker-server[25526]:     recipeSet.append(r.to_xml(clone, include_enclosing_job=False, **kwargs))
Feb 15 02:33:18 beaker-devel.app.eng.bos.redhat.com beaker-server[25526]:   File "/usr/lib/python2.6/site-packages/bkr/server/model/scheduler.py", line 3046, in to_xml
Feb 15 02:33:18 beaker-devel.app.eng.bos.redhat.com beaker-server[25526]:     include_enclosing_job=include_enclosing_job, **kwargs)
Feb 15 02:33:18 beaker-devel.app.eng.bos.redhat.com beaker-server[25526]:   File "/usr/lib/python2.6/site-packages/bkr/server/model/scheduler.py", line 2181, in to_xml
Feb 15 02:33:18 beaker-devel.app.eng.bos.redhat.com beaker-server[25526]:     drs = self.installation.distro_to_xml()
Feb 15 02:33:18 beaker-devel.app.eng.bos.redhat.com beaker-server[25526]:   File "/usr/lib/python2.6/site-packages/bkr/server/model/installation.py", line 67, in distro_to_xml
Feb 15 02:33:18 beaker-devel.app.eng.bos.redhat.com beaker-server[25526]:     E.tree(url=self.tree_url),
Feb 15 02:33:18 beaker-devel.app.eng.bos.redhat.com beaker-server[25526]:   File "/usr/lib64/python2.6/site-packages/lxml/builder.py", line 210, in __call__
Feb 15 02:33:18 beaker-devel.app.eng.bos.redhat.com beaker-server[25526]:     get(dict)(elem, attrib)
Feb 15 02:33:18 beaker-devel.app.eng.bos.redhat.com beaker-server[25526]:   File "/usr/lib64/python2.6/site-packages/lxml/builder.py", line 197, in add_dict
Feb 15 02:33:18 beaker-devel.app.eng.bos.redhat.com beaker-server[25526]:     attrib[k] = typemap[type(v)](None, v)
Feb 15 02:33:18 beaker-devel.app.eng.bos.redhat.com beaker-server[25526]: KeyError: <type 'NoneType'>
Feb 15 02:33:18 beaker-devel.app.eng.bos.redhat.com beaker-server[25526]: bkr.server.wsgi DEBUG Rolling back for 500 response
Looking closer into the above error, it turns out that the data is malformed. It must have been created by an intermediate state of the code; later changes prevent data from ending up in this state. The installation is currently:

Installation(created=datetime.datetime(2018, 1, 10, 16, 3, 44), system=dev-kvm-guest-08.rhts.eng.bos.redhat.com, distro_tree=DistroTree(distro=Distro(name=u'RHEL5.9-Server-20120919.0.n'), variant=u'', arch=Arch(u'i386')), kernel_options=u'console=ttyS0 ks=http://beaker-devel.app.eng.bos.redhat.com/kickstart/17585 ksdevice=bootif netbootloader=pxelinux.0 serial', rendered_kickstart=None, rebooted=datetime.datetime(2018, 1, 10, 16, 4, 13), install_started=datetime.datetime(2018, 1, 10, 16, 5, 13), install_finished=datetime.datetime(2018, 1, 10, 16, 13, 56), postinstall_finished=datetime.datetime(2018, 1, 10, 16, 14, 11), tree_url=None, initrd_path=None, kernel_path=None, arch=None, distro_name=None, osmajor=None, osminor=None, variant=None)

but the distro_requires is empty. The XML parsing code already ensures at submission time that either a distro_requires or a distro is given. So moving this back to ON_QA.
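The KeyError in the clone traceback is raised because an attribute value of None (tree_url here) was handed to the lxml element builder. As a sketch of how serialization code can tolerate such rows, the helper below drops None-valued attributes before building the element. It uses the standard library's ElementTree as a stand-in for lxml, and element_without_none() is a hypothetical name, not part of Beaker.

```python
import xml.etree.ElementTree as ET


def element_without_none(tag, **attrs):
    """Build an element, skipping attributes whose value is None."""
    present = {key: value for key, value in attrs.items() if value is not None}
    return ET.Element(tag, present)


# A malformed row like the one above has tree_url=None.
tree = element_without_none("tree", url=None)
kernel = element_without_none("kernel", url="http://fqdn/path/to/kernel")
serialized = ET.tostring(kernel)
```

Whether silently omitting the attribute is acceptable depends on the caller; for cloning, refusing to emit a <distro/> without a tree URL may be the safer choice.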
I've now tested various scenarios, mostly involving borked XML:
* using distro_requires and distro together
* not specifying mandatory elements for distro
* some variations with invalid values

I've tested submitting XML in the web UI and from the Beaker client, which all work. Apart from a minor glitch filed as Bug 1548889 I think we're good to go.
Beaker 25.0 has been released. Release notes are available upstream: https://beaker-project.org/docs/whats-new/release-25.html