Bug 1519342

Summary:

Persisted sqlite on smart proxy for remote execution is too slow for remote execution at scale

Product:

Red Hat Satellite

Reporter:

Ivan Necas <inecas>

Component:

Remote Execution

Assignee:

satellite6-bugs <satellite6-bugs>

Status:

CLOSED ERRATA

QA Contact:

Roman Plevka <rplevka>

Severity:

high

Docs Contact:

Priority:

high

Version:

6.2.6

CC:

aperotti, aruzicka, bbuckingham, bkearney, cdonnell, cduryee, egolov, ehelms, inecas, jcallaha, jhutar, mmccune, oshtaier, pmoravec, psuriset, rplevka, zhunting

Target Milestone:

Unspecified

Keywords:

PrioBumpField, Triaged

Target Release:

Unused

Hardware:

Unspecified

OS:

Unspecified

Whiteboard:

Fixed In Version:

tfm-rubygem-smart_proxy_dynflow_core-0.1.8

Doc Type:

If docs needed, set a value

Doc Text:

Story Points:

---

Clone Of:

1416542

Environment:

Last Closed:

2018-02-21 16:54:37 UTC

Type:

Bug

Regression:

---

Mount Type:

---

Documentation:

---

CRM:

Verified Versions:

Category:

---

oVirt Team:

---

RHEL 7.3 requirements from Atomic Host:

Cloudforms Team:

---

Target Upstream Version:

Embargoed:

Bug Depends On:

1416542

Bug Blocks:

Attachments:

Description	Flags
rex_times.dat	none

Comment 2 Ivan Necas 2018-01-02 19:29:01 UTC

*** Bug 1416542 has been marked as a duplicate of this bug. ***

Comment 3 Roman Plevka 2018-01-08 18:40:53 UTC

VERIFIED on sat6.3.0-30

- verified that the configuration is set to use the in-memory db
- performed and measured times of `date` command execution on {1..10001} hosts:


Y-axis refers to time in seconds:

                              REX job execution time                          
                                                                               
  1000 +-+-----------+------------+-------------+------------+-----------+-A   
       +             +            +             +            +             +   
   900 +-+                                          execution time  A A  +-+   
   800 +-+                                                               +-+   
       |                                                     A             |   
   700 +-+                                             A                 +-+   
       |                                                                   |   
   600 +-+                                      A                        +-+   
   500 +-+                                                               +-+   
       |                                 A                                 |   
   400 +-+                                                               +-+   
       |                          A                                        |   
   300 +-+                 A                                             +-+   
       |                                                                   |   
   200 +-+           A                                                   +-+   
   100 +-+    A                                                          +-+   
       +             +            +             +            +             +   
     0 A-+-----------+------------+-------------+------------+-----------+-+   
       0            200          400           600          800           1000 
                                     hosts (#)                                 


Planning times seem to be sane as well now:


                             REX job planning times                            
                                                                               
  3.5 +-+-----------+-------------+------------+-------------+-----------+-+   
      +             +             +            +             +             A   
    3 +-+                                            planningAtime    A  +-+   
      |                                                             A      |   
      |                                                                    |   
  2.5 +-+                                A            A                  +-+   
      |                                                                    |   
    2 +-+                                      A                         +-+   
      |                                                                    |   
      |                                                                    |   
  1.5 +-+                         A                                      +-+   
      |                                                                    |   
    1 +-+           A      A                                             +-+   
      |                                                                    |   
      |                                                                    |   
  0.5 +-+    A                                                           +-+   
      +             +             +            +             +             +   
    0 A-+-----------+-------------+------------+-------------+-----------+-+   
      0            200           400          600           800           1000 
                                    hosts (#)                                  
                                     


I also tried to run the above on 10k hosts (10001):
execution time: 11098 s
planning time:  76.21 s

(attaching the complete dataset as rex_times.dat)
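
As a rough sanity check on the 10k run, the per-host cost can be derived from the two figures above. This is only a minimal sketch, assuming both values are in seconds (as on the plots); `per_host` is a hypothetical helper name for illustration and not part of any Satellite tooling.

```shell
#!/bin/bash
# per_host TOTAL_SECONDS HOSTS: print the average number of seconds per host.
# Hypothetical helper, for illustration only.
per_host() {
  awk -v total="$1" -v hosts="$2" 'BEGIN { printf "%.4f\n", total / hosts }'
}

per_host 11098 10001   # execution time per host, prints 1.1097
per_host 76.21 10001   # planning time per host, prints 0.0076
```

That works out to roughly 1.1 s of execution time and under 10 ms of planning time per host, which is in the same range as the roughly 1 s per host readable from the 0 to 1000 host plot.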

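Script used (server side):

# Run `date` through REX on the first 0, 100, ..., 1000 hosts and log the elapsed time.
for i in {0..1000..100}; do
  # The second comma-separated column of fero_hosts is used as the host name
  HOSTS=$(awk -F "," '{print $2","}' fero_hosts | head -n "$i")
  printf '%s\t' "$i" >> times.csv
  # %e = elapsed wall-clock time in seconds
  /usr/bin/time --output time.txt -f "%e" hammer job-invocation create --inputs "command=date" --search-query="name ^ (${HOSTS})" --job-template-id=107
  cat time.txt >> times.csv
  echo "" > time.txt
done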
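
To make the `--search-query` construction in the loop easier to follow, here is a small dry run of the same awk/head pipeline. It is only a sketch: the contents of `fero_hosts` are not shown in this report, so the sample file below (an ID in the first column, a host name in the second) and the `example.com` names are assumptions, and `build_query` is a hypothetical helper.

```shell
#!/bin/bash
# build_query FILE COUNT: print the search query the loop would pass to hammer
# for the first COUNT hosts. Hypothetical helper; nothing is sent to Satellite.
build_query() {
  local hosts
  hosts=$(awk -F "," '{print $2","}' "$1" | head -n "$2")
  printf 'name ^ (%s)\n' "$hosts"
}

# Assumed sample input: "<id>,<hostname>" per line.
sample=$(mktemp)
printf '1,host1.example.com\n2,host2.example.com\n3,host3.example.com\n' > "$sample"

build_query "$sample" 2
rm -f "$sample"
```

Note that the generated list keeps one host per line and ends with a trailing comma; the measurements in this comment were taken with the query in that form.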

Comment 4 Roman Plevka 2018-01-08 18:42:15 UTC

Created attachment 1378663 [details]
rex_times.dat

Comment 5 Satellite Program 2018-02-21 16:54:37 UTC

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:0336