Bug 1519342
| Summary: | Persisted sqlite on smart proxy for remote execution is too slow for remote execution at scale | ||||||
|---|---|---|---|---|---|---|---|
| Product: | Red Hat Satellite | Reporter: | Ivan Necas <inecas> | ||||
| Component: | Remote Execution | Assignee: | satellite6-bugs <satellite6-bugs> | ||||
| Status: | CLOSED ERRATA | QA Contact: | Roman Plevka <rplevka> | ||||
| Severity: | high | Docs Contact: | |||||
| Priority: | high | ||||||
| Version: | 6.2.6 | CC: | aperotti, aruzicka, bbuckingham, bkearney, cdonnell, cduryee, egolov, ehelms, inecas, jcallaha, jhutar, mmccune, oshtaier, pmoravec, psuriset, rplevka, zhunting | ||||
| Target Milestone: | Unspecified | Keywords: | PrioBumpField, Triaged | ||||
| Target Release: | Unused | ||||||
| Hardware: | Unspecified | ||||||
| OS: | Unspecified | ||||||
| Whiteboard: | |||||||
| Fixed In Version: | tfm-rubygem-smart_proxy_dynflow_core-0.1.8 | Doc Type: | If docs needed, set a value | ||||
| Doc Text: | Story Points: | --- | |||||
| Clone Of: | 1416542 | Environment: | |||||
| Last Closed: | 2018-02-21 16:54:37 UTC | Type: | Bug | ||||
| Regression: | --- | Mount Type: | --- | ||||
| Documentation: | --- | CRM: | |||||
| Verified Versions: | Category: | --- | |||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||
| Embargoed: | |||||||
| Bug Depends On: | 1416542 | ||||||
| Bug Blocks: | |||||||
| Attachments: |
Comment 2
Ivan Necas
2018-01-02 19:29:01 UTC
VERIFIED on sat6.3.0-30
- verified that the configuration is set to use the in-memory db
- performed and measured times of `date` command execution on {1..10001} hosts:
Y-axis refers to time in seconds:
[Plot: REX job execution time. X-axis: hosts (#), 0 to 1000. Y-axis: execution time, 0 to 1000 seconds. Individual data points are not recoverable from this text rendering.]
Planning times seem to be sane as well now:
[Plot: REX job planning times. X-axis: hosts (#), 0 to 1000. Y-axis: planning time, 0 to 3.5. Individual data points are not recoverable from this text rendering.]
I also tried to run the above on 10k hosts (10001):
execution time: 11098
planning time: 76.21
(attaching the complete dataset as rex_times.csv)
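The first verification point (configuration set to the in-memory db) could be checked with something like the sketch below. It rests on assumptions not stated in this report: that the dynflow core settings are a YAML file with a `:database:` key, and that an empty or missing value means the in-memory database. The function name `uses_in_memory_db` is hypothetical.

```shell
# Sketch only: the ":database:" key and its "empty means in-memory" meaning
# are assumptions, not taken from this bug report.
# Exit status: 0 = key empty or absent, 1 = a database path is set,
#              2 = settings file not readable.
uses_in_memory_db() {
    settings_file=$1
    [ -r "$settings_file" ] || return 2
    # Take the text after ":database:", drop a trailing comment and any quotes.
    value=$(sed -n 's/^:database:[[:space:]]*//p' "$settings_file" \
        | sed 's/[[:space:]]*#.*$//' | tr -d "\"'" | head -n 1)
    [ -z "$value" ]
}
```

It would be called with the path to the settings file on the smart proxy, for example `uses_in_memory_db /etc/smart_proxy_dynflow_core/settings.yml && echo in-memory`; that path is also an assumption.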
used script (server side):
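for i in {0..1000..100}; do
    # Build a comma-separated list of the first $i host names (column 2 of fero_hosts).
    HOSTS=$(awk -F "," '{print $2}' fero_hosts | head -n "$i" | paste -sd, -)
    # Write the host count, then append the measured time on the same line.
    printf '%s\t' "$i" >> times.csv
    # %e is the elapsed wall-clock time of the job invocation, in seconds.
    /usr/bin/time --output time.txt -f "%e" hammer job-invocation create --inputs "command=date" --search-query="name ^ (${HOSTS})" --job-template-id=107
    cat time.txt >> times.csv
done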
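To compare runs of different sizes, the per-host cost can be derived from the file the loop writes. The following is a minimal sketch, assuming times.csv holds one `hosts<TAB>seconds` pair per line as produced by the script; the helper name `per_host_times` is hypothetical.

```shell
# Reads "hosts<TAB>seconds" lines from the given files (or stdin) and prints
# hosts, total seconds and seconds per host, tab-separated.
per_host_times() {
    awk -F '\t' '
        # Skip rows with zero hosts (division by zero) or a missing time field.
        $1 > 0 && $2 != "" { printf "%d\t%s\t%.2f\n", $1, $2, $2 / $1 }
    ' "$@"
}
```

Run as `per_host_times times.csv`. Applied to the 10k run reported above (10001 hosts, 11098 s), this works out to roughly 1.11 s per host.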
Created attachment 1378663
rex_times.dat
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:0336