Bug 1603706

Summary:

multiple virt-who configurations create multiples users == *undefined method `[]' for nil:NilClass* error on Hypervisor tasks

Product:

Red Hat Satellite

Reporter:

Waldirio M Pinheiro <wpinheir>

Component:

Subscriptions - virt-who

Assignee:

satellite6-bugs <satellite6-bugs>

Status:

CLOSED DEFERRED

QA Contact:

Perry Gagne <pgagne>

Severity:

high

Docs Contact:

Priority:

urgent

Version:

6.3.2

CC:

aeladawy, akapse, arahaman, bbuckingham, bcourt, bkearney, cmarinea, dpeess, jalviso, jmcdonald, khowell, ktordeur, mhulan, mjia, mmccune, patalber, phess, pmoravec, rjerrido, syangsao, wpinheir, wpoteat

Target Milestone:

Released

Keywords:

PrioBumpGSS, Triaged

Target Release:

Unused

Hardware:

All

OS:

All

Whiteboard:

Fixed In Version:

Doc Type:

If docs needed, set a value

Doc Text:

Story Points:

---

Clone Of:

Clones:

1615405

Environment:

Last Closed:

2019-01-18 19:17:50 UTC

Type:

Bug

Bug Depends On:

1615405

Bug Blocks:

Attachments:

Description	Flags
virt-who outputs and snippets from production.log	none
trace from 6.5 snap 1	none

Description Waldirio M Pinheiro 2018-07-19 18:30:20 UTC

Description of problem:
The customer has more than one configuration file, each pointing to a different vCenter. virt-who is able to reach all of them and collect the info, BUT the way it prepares the data and uploads it to Satellite is different from what is expected, *causing an error on the Satellite side*.

Version-Release number of selected component (if applicable):
virt-who-0.21.7-1.el7_5.noarch
Satellite 6.3.x

How reproducible:
Only in the customer environment. I'm not able to reproduce it locally.

Steps to Reproduce:
1. Configure multiple vCenters
2. Execute virt-who -o
3. Check the output

Actual results:
We see one *Sending updated Host-to-guest mapping to* line for each *Hosts-to-guests mapping for config* line, so for a customer with 4 config files we see:
---
Hosts-to-guests mapping for config conf_file_1
Sending updated Host-to-guest mapping to "ORG"

Hosts-to-guests mapping for config conf_file_2
Sending updated Host-to-guest mapping to "ORG"

Hosts-to-guests mapping for config conf_file_3
Sending updated Host-to-guest mapping to "ORG"

Hosts-to-guests mapping for config conf_file_4
Sending updated Host-to-guest mapping to "ORG"
---

Expected results:
Just one *Sending updated Host-to-guest mapping to* line, as shown below:
---
Hosts-to-guests mapping for config conf_file_1
Hosts-to-guests mapping for config conf_file_2
Hosts-to-guests mapping for config conf_file_3
Hosts-to-guests mapping for config conf_file_4

Sending updated Host-to-guest mapping to "ORG"
---
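
The difference between the actual and expected output can be checked mechanically. The following is a minimal sketch, assuming the captured `virt-who -o` output contains the two message strings exactly as quoted in this report; the function name `count_mapping_lines` is hypothetical and not part of virt-who.

```python
# Hypothetical helper: counts the two message types quoted in this report.
# Assumes the captured output contains these strings verbatim.
COLLECT_MARKER = "Hosts-to-guests mapping for config"
UPLOAD_MARKER = "Sending updated Host-to-guest mapping to"


def count_mapping_lines(output: str) -> dict:
    """Return how many per-config mappings were collected and how many uploads were sent."""
    configs = 0
    uploads = 0
    for line in output.splitlines():
        if COLLECT_MARKER in line:
            configs += 1
        elif UPLOAD_MARKER in line:
            uploads += 1
    return {"configs": configs, "uploads": uploads}
```

With 4 config files, the actual results above would give 4 configs and 4 uploads, while the expected results would give 4 configs and 1 upload.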


Additional info:
More detail below.

Comment 4 Barnaby Court 2018-07-20 15:01:31 UTC

Waldirio, I see one error in the candlepin log file:

ERROR org.hornetq.core.server - HQ224016: Caught exception java.lang.IllegalStateException: Can't write records bigger than the bufferSize(102400) on the journal

I would recommend setting the 
candlepin.audit.hornetq.large_msg_size
property in the /etc/candlepin/candlepin.conf file to something larger than 102400 and see if it makes a difference. 

Please try setting that value to 1024000 and see if that fixes the problem.
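
For anyone scripting this change, here is a minimal sketch of setting the property in the text of the config file. It assumes candlepin.conf consists of plain `key=value` lines with `#` comments; the function name `set_property` is hypothetical. It only transforms text, so writing the result back to /etc/candlepin/candlepin.conf is left to the caller.

```python
def set_property(conf_text: str, key: str, value: str) -> str:
    """Return conf_text with key=value set.

    Replaces the first uncommented entry for key, drops later duplicates,
    and appends the entry if the key is not present.
    Assumes a simple key=value format with '#' comment lines.
    """
    out = []
    replaced = False
    for line in conf_text.splitlines():
        stripped = line.strip()
        is_entry = (
            not stripped.startswith("#")
            and stripped.split("=", 1)[0].strip() == key
        )
        if is_entry:
            if not replaced:
                out.append(f"{key}={value}")
                replaced = True
            continue  # skip the old entry and any duplicates
        out.append(line)
    if not replaced:
        out.append(f"{key}={value}")
    return "\n".join(out) + "\n"
```

For example, `set_property(text, "candlepin.audit.hornetq.large_msg_size", "1024000")` produces the suggested setting whether or not the property was already present.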

Comment 5 Barnaby Court 2018-07-20 15:02:51 UTC

Waldirio, if setting the larger value does not fix the problem, please scan the candlepin log file for new errors. If you are no longer seeing the HQ224016 error (or something else that stands out) in the candlepin log, then the next place to look would be the foreman log.

Comment 7 Pablo Hess 2018-07-23 18:45:46 UTC

On my reproducer, increasing large_msg_size to 1024000 as per c#4 resolves the failed upload issue.


I'll be attaching log files and virt-who -o output files (in a tarball) following this scheme:

* Running virt-who on the capsule, identified as $CAPSULE on the logs
* Two hypervisors, $HYPERVISOR01 and $HYPERVISOR02
* Two virt-who config files, virt-who-config-1.conf and virt-who-config-2.conf pointing to respective hypervisors.


Run virt-who -o -c virt-who-config-1.conf on capsule, capture command output on capsule command line and production.log on satellite.
* Files:
   - virt-who-hypervisor01.log
   - virt-who-1-output.out


Run virt-who -o -c virt-who-config-2.conf on capsule, capture command output on capsule command line and production.log on satellite.
* Files:
   - virt-who-hypervisor02.log
   - virt-who-2-output.out


Run virt-who -o on capsule, capture output, and production.log on satellite with a backtrace:
* Files:
   - virt-who-all-together-07.log
   - virt-who-all-together-output-07.out



Add line to /etc/candlepin/candlepin.conf:

 candlepin.audit.hornetq.large_msg_size=1024000

Restart tomcat service, run virt-who -o a few times on capsule (see why below).

Then run virt-who -o on capsule, capture command output, and production.log with no backtrace.
* Files:
   - virt-who-all-together.log
   - virt-who-all-together-output.out



## NOTE: not every run of virt-who was failing before modifying the config file; only between 20% and 50% of runs were failing.

## NOTE 2: after modifying the candlepin config file to increase or decrease large_msg_size, the first run of virt-who behaves as if no modification was made.

Example:

1. Candlepin is running with increased large_msg_size=1024000. No tracebacks on the logs after 10 runs of virt-who -o.
2. Reduce large_msg_size by commenting out the line in candlepin conf file.
3. Restart tomcat
4. Run virt-who -o, no traceback appears.
5. Run virt-who -o again 30 seconds after the last one finished, get a traceback.
6. Further runs of virt-who -o will produce a traceback between 20% and 50% of the time.
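
Because the failure is intermittent, it helps to measure the failure rate over several runs. The following is a minimal sketch, assuming that one production.log snippet is captured per `virt-who -o` run and that a failed run contains the error text from this bug's summary. The function name `failure_rate` is hypothetical.

```python
# Error text taken from this bug's summary; assumed to appear in the
# production.log snippet of every failed run.
ERROR_MARKER = "undefined method `[]' for nil:NilClass"


def failure_rate(log_snippets):
    """Return the fraction of runs whose log snippet contains the error text."""
    if not log_snippets:
        return 0.0
    failed = sum(1 for snippet in log_snippets if ERROR_MARKER in snippet)
    return failed / len(log_snippets)
```

A rate that stays at 0.0 over many runs with the increased large_msg_size, and returns to the 0.2 to 0.5 range without it, would match the behaviour described above.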

Comment 8 Pablo Hess 2018-07-23 18:48:30 UTC

Created attachment 1470037
virt-who outputs and snippets from production.log

As described in my latest comment.

Comment 33 Marek Hulan 2018-11-09 12:42:40 UTC

Created attachment 1503629
trace from 6.5 snap 1

Comment 38 William Poteat 2019-01-18 19:17:50 UTC

This BZ has become far too unwieldy. Customer issues are attached simply because there is a problem with virt-who and not because of the specific scenario. This will not help get the issues resolved, just the opposite. Individual problems are likely to be ignored.


If the issue is about a db error in candlepin because the IN clause is too large you should use:
https://bugzilla.redhat.com/show_bug.cgi?id=1599752


If the issue is that you are using multiple users to do hypervisor updates for one org, instead of a single one, you should use:
https://bugzilla.redhat.com/show_bug.cgi?id=1667545


If the issue is that you are trying to submit a hypervisor in the report that does not have a value for the field you have chosen as the hostname, then you should use:
https://bugzilla.redhat.com/show_bug.cgi?id=1667522