Bug 1268756 - Log rotation breaks running processes, leaving Satellite not responding
Summary: Log rotation breaks running processes, leaving Satellite not responding
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Satellite
Classification: Red Hat
Component: Pulp
Version: 6.1.2
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: Unspecified
Assignee: satellite6-bugs
QA Contact: Katello QA List
URL:
Whiteboard:
Depends On:
Blocks: 1122832 1315274
 
Reported: 2015-10-05 07:41 UTC by Peter Vreman
Modified: 2019-12-16 04:59 UTC
CC List: 8 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Cloned to: 1315274
Environment:
Last Closed: 2016-11-17 16:18:47 UTC
Target Upstream Version:
Embargoed:


Attachments: none

Description Peter Vreman 2015-10-05 07:41:48 UTC
Description of problem:
During the weekly log rotation over the weekend, Satellite 6 stops responding. We have now seen this issue three weekends in a row. Below are the log lines from the first and third occurrences.
It looks like it is related to Content Management (Pulp) processes running at the time of the log rotation.


Weekend of 04-Oct

[crash] root@li-lc-1578:~# tail -n10 /var/log/foreman/production.log-20151004 | cut -c-120
2015-10-04 03:43:40 [I] Processing by Katello::Api::Rhsm::CandlepinProxiesController#server_status as JSON
2015-10-04 03:43:42 [I] Completed 200 OK in 1393ms (Views: 1.2ms | ActiveRecord: 0.0ms)
2015-10-04 03:43:42 [I] Processing by Katello::Api::Rhsm::CandlepinProxiesController#hypervisors_update as JSON
2015-10-04 03:43:42 [I]   Parameters: {"34303935-3038-5a43-4a32-3232304c4346"=>[{"guestId"=>"42175492-b928-4d5e-a854-4ad
2015-10-04 03:43:45 [I] Completed 200 OK in 3084ms (Views: 1006.3ms | ActiveRecord: 28.6ms)
2015-10-04 03:43:49 [I] Completed 200 OK in 22456ms (Views: 3527.8ms | ActiveRecord: 518.0ms)
2015-10-04 03:44:51 [I] Processing by ForemanTasks::Api::TasksController#bulk_search as */*
2015-10-04 03:44:51 [I]   Parameters: {"per_page"=>9999, "searches"=>[{"type"=>"task", "task_id"=>"a0b6ece5-fd50-49d0-94
2015-10-04 03:44:51 [I] Authorized user hoici(hoici )
2015-10-04 03:45:13 [I] Completed 200 OK in 22572ms (Views: 3647.5ms | ActiveRecord: 570.2ms)

[crash] root@li-lc-1578:~# head -n20 /var/log/foreman/production.log-20151005
2015-10-04 03:46:17 [I] Client disconnected.
2015-10-04 03:46:17 [I] Client disconnected.
2015-10-04 03:46:17 [I] Client disconnected.
2015-10-04 03:46:17 [I] Client disconnected.
2015-10-04 03:46:18 [W] message could not be sent Dynflow::Listeners::Serialization::Protocol::Done[request_id: 76] because Broken pipe
2015-10-04 03:46:44 [W] message could not be sent Dynflow::Listeners::Serialization::Protocol::Done[request_id: 85] because Broken pipe
2015-10-04 03:46:44 [I] Connecting to database specified by database.yml
2015-10-04 03:46:51 [W] message could not be sent Dynflow::Listeners::Serialization::Protocol::Done[request_id: 94] because Broken pipe
2015-10-04 03:46:53 [W] message could not be sent Dynflow::Listeners::Serialization::Protocol::Done[request_id: 95] because Broken pipe
2015-10-04 03:47:10 [W] Creating scope :completer_scope. Overwriting existing method Organization.completer_scope.
2015-10-04 03:47:10 [W] Creating scope :completer_scope. Overwriting existing method Location.completer_scope.
2015-10-04 03:47:23 [W] Creating scope :latest. Overwriting existing method Scaptimony::ArfReport.latest.
2015-10-04 03:48:15 [W] message could not be sent Dynflow::Listeners::Serialization::Protocol::Done[request_id: 90] because Broken pipe
2015-10-04 03:48:16 [I] Connecting to database specified by database.yml
2015-10-04 03:48:20 [W] message could not be sent Dynflow::Listeners::Serialization::Protocol::Done[request_id: 75] because Broken pipe
2015-10-04 03:48:41 [W] message could not be sent Dynflow::Listeners::Serialization::Protocol::Done[request_id: 79] because Broken pipe
2015-10-04 03:48:42 [W] Creating scope :completer_scope. Overwriting existing method Organization.completer_scope.
2015-10-04 03:48:42 [W] Creating scope :completer_scope. Overwriting existing method Location.completer_scope.
2015-10-04 03:48:56 [W] Creating scope :latest. Overwriting existing method Scaptimony::ArfReport.latest.
2015-10-04 03:49:01 [W] message could not be sent Dynflow::Listeners::Serialization::Protocol::Done[request_id: 81] because Broken pipe

[crash] root@li-lc-1578:/var/log# tail -n10 cron-20151004
Oct  4 03:20:01 li-lc-1578 CROND[18620]: (root) CMD (/usr/lib64/sa/sa1 1 1)
Oct  4 03:25:01 li-lc-1578 CROND[20522]: (root) CMD (/usr/lib64/sa/sa1 1 1)
Oct  4 03:30:01 li-lc-1578 CROND[22197]: (foreman) CMD (   /usr/sbin/foreman-rake trends:counter >>/var/log/foreman/cron.log 2>&1)
Oct  4 03:30:01 li-lc-1578 CROND[22196]: (root) CMD (/usr/lib64/sa/sa1 1 1)
Oct  4 03:30:01 li-lc-1578 CROND[22198]: (foreman) CMD (   /usr/sbin/foreman-rake ldap:refresh_usergroups >>/var/log/foreman/cron.log 2>&1)
Oct  4 03:35:01 li-lc-1578 CROND[23780]: (root) CMD (/usr/lib64/sa/sa1 1 1)
Oct  4 03:40:01 li-lc-1578 CROND[25335]: (root) CMD (/usr/lib64/sa/sa1 1 1)
Oct  4 03:45:01 li-lc-1578 CROND[26627]: (root) CMD (/usr/lib64/sa/sa1 1 1)
Oct  4 03:46:02 li-lc-1578 anacron[9789]: Job `cron.daily' started
Oct  4 03:46:02 li-lc-1578 run-parts(/etc/cron.daily)[26952]: starting logrotate

[crash] root@li-lc-1578:/var/log# head -n20 cron
Oct  4 03:46:21 li-lc-1578 run-parts(/etc/cron.daily)[27170]: finished logrotate
Oct  4 03:46:21 li-lc-1578 run-parts(/etc/cron.daily)[26952]: starting makewhatis.cron
Oct  4 03:46:23 li-lc-1578 run-parts(/etc/cron.daily)[27342]: finished makewhatis.cron
Oct  4 03:46:23 li-lc-1578 run-parts(/etc/cron.daily)[26952]: starting mlocate.cron
Oct  4 03:46:46 li-lc-1578 run-parts(/etc/cron.daily)[27496]: finished mlocate.cron
Oct  4 03:46:46 li-lc-1578 run-parts(/etc/cron.daily)[26952]: starting prelink
Oct  4 03:46:46 li-lc-1578 run-parts(/etc/cron.daily)[27508]: finished prelink
Oct  4 03:46:46 li-lc-1578 run-parts(/etc/cron.daily)[26952]: starting readahead.cron
Oct  4 03:46:46 li-lc-1578 run-parts(/etc/cron.daily)[27520]: finished readahead.cron
Oct  4 03:46:46 li-lc-1578 run-parts(/etc/cron.daily)[26952]: starting rhsmd
Oct  4 03:47:47 li-lc-1578 run-parts(/etc/cron.daily)[27612]: finished rhsmd
Oct  4 03:47:47 li-lc-1578 run-parts(/etc/cron.daily)[26952]: starting tmpwatch
Oct  4 03:47:47 li-lc-1578 run-parts(/etc/cron.daily)[27650]: finished tmpwatch
Oct  4 03:47:47 li-lc-1578 anacron[9789]: Job `cron.daily' terminated
Oct  4 03:47:47 li-lc-1578 anacron[9789]: Normal exit (1 job run)
Oct  4 03:50:02 li-lc-1578 CROND[28029]: (root) CMD (/usr/lib64/sa/sa1 1 1)
Oct  4 03:55:01 li-lc-1578 CROND[28796]: (root) CMD (/usr/lib64/sa/sa1 1 1)
Oct  4 04:00:01 li-lc-1578 CROND[29371]: (root) CMD (/opt/hoi/bin/hoi-cron-wrapper /opt/hoi/bin/hoi-updater --run-cron)
Oct  4 04:00:01 li-lc-1578 CROND[29373]: (foreman) CMD (   /usr/sbin/foreman-rake ldap:refresh_usergroups >>/var/log/foreman/cron.log 2>&1)
Oct  4 04:00:01 li-lc-1578 CROND[29372]: (foreman) CMD (   /usr/sbin/foreman-rake trends:counter >>/var/log/foreman/cron.log 2>&1)



Weekend of 20-Sep

[crash] root@li-lc-1578:/var/log/foreman# grep '^2015' /var/log/foreman/production.log-20150920 | tail -n10 | cut -c-120
2015-09-20 03:10:12 [I] Authorized user hoici(hoici )
2015-09-20 03:10:29 [I] Completed 200 OK in 16763ms (Views: 2054.7ms | ActiveRecord: 285.6ms)
2015-09-20 03:11:31 [I] Processing by ForemanTasks::Api::TasksController#bulk_search as */*
2015-09-20 03:11:31 [I]   Parameters: {"per_page"=>9999, "searches"=>[{"type"=>"task", "task_id"=>"88f37ece-15bd-488a-a2
2015-09-20 03:11:32 [I] Authorized user hoici(hoici )
2015-09-20 03:11:48 [I] Completed 200 OK in 17146ms (Views: 2212.8ms | ActiveRecord: 272.3ms)
2015-09-20 03:12:51 [I] Processing by ForemanTasks::Api::TasksController#bulk_search as */*
2015-09-20 03:12:51 [I]   Parameters: {"per_page"=>9999, "searches"=>[{"type"=>"task", "task_id"=>"88f37ece-15bd-488a-a2
2015-09-20 03:12:51 [I] Authorized user hoici(hoici )
2015-09-20 03:13:08 [I] Completed 200 OK in 17382ms (Views: 1854.5ms | ActiveRecord: 290.9ms)

[crash] root@li-lc-1578:/var/log/foreman# grep -v '^/' /var/log/foreman/production.log-20150921 | head -n20
[ERROR 2015-09-20 03:14:03 pulp_rest  #2960] RestClient.post "https://li-lc-1578.hag.hilti.com/pulp/api/v2/content/units/rpm/search/", 4236 byte(s) length, "Accept"=>"*/*; q=0.5, application/xml", "Accept-Encoding"=>"gzip, deflate", "Authorization"=>"OAuth oauth_body_hash=\"2jmj7l5rSw0yVb%2FvlWAYkK%2FYBwk%3D\", oauth_consumer_key=\"katello\", oauth_nonce=\"zDqfEqvwKTnWqlaTqi8sF3QrqY5XWN6jw7GyDfpxM4\", oauth_signature=\"GI13syno0e5DLg5o%2FJEWXDD5AA4%3D\", oauth_signature_method=\"HMAC-SHA1\", oauth_timestamp=\"1442718842\", oauth_version=\"1.0\"", "Content-Length"=>"4236", "accept"=>"application/json", "content_type"=>"application/json", "pulp-user"=>"hoici-59c70e2b"
 |
[ERROR 2015-09-20 03:14:03 pulp_rest  #2960] RestClient.post "https://li-lc-1578.hag.hilti.com/pulp/api/v2/content/units/rpm/search/", 4236 byte(s) length, "Accept"=>"*/*; q=0.5, application/xml", "Accept-Encoding"=>"gzip, deflate", "Authorization"=>"OAuth oauth_body_hash=\"2jmj7l5rSw0yVb%2FvlWAYkK%2FYBwk%3D\", oauth_consumer_key=\"katello\", oauth_nonce=\"zDqfEqvwKTnWqlaTqi8sF3QrqY5XWN6jw7GyDfpxM4\", oauth_signature=\"GI13syno0e5DLg5o%2FJEWXDD5AA4%3D\", oauth_signature_method=\"HMAC-SHA1\", oauth_timestamp=\"1442718842\", oauth_version=\"1.0\"", "Content-Length"=>"4236", "accept"=>"application/json", "content_type"=>"application/json", "pulp-user"=>"hoici-59c70e2b"
 |
2015-09-20 03:14:03 [E] Server broke connection (RestClient::ServerBrokeConnection)
[ERROR 2015-09-20 03:14:03 pulp_rest  #2960] RestClient.post "https://li-lc-1578.hag.hilti.com/pulp/api/v2/content/units/rpm/search/", 4236 byte(s) length, "Accept"=>"*/*; q=0.5, application/xml", "Accept-Encoding"=>"gzip, deflate", "Authorization"=>"OAuth oauth_body_hash=\"2jmj7l5rSw0yVb%2FvlWAYkK%2FYBwk%3D\", oauth_consumer_key=\"katello\", oauth_nonce=\"zDqfEqvwKTnWqlaTqi8sF3QrqY5XWN6jw7GyDfpxM4\", oauth_signature=\"GI13syno0e5DLg5o%2FJEWXDD5AA4%3D\", oauth_signature_method=\"HMAC-SHA1\", oauth_timestamp=\"1442718842\", oauth_version=\"1.0\"", "Content-Length"=>"4236", "accept"=>"application/json", "content_type"=>"application/json", "pulp-user"=>"hoici-59c70e2b"
 |
[ERROR 2015-09-20 03:14:03 pulp_rest  #2960] RestClient.post "https://li-lc-1578.hag.hilti.com/pulp/api/v2/content/units/rpm/search/", 4236 byte(s) length, "Accept"=>"*/*; q=0.5, application/xml", "Accept-Encoding"=>"gzip, deflate", "Authorization"=>"OAuth oauth_body_hash=\"2jmj7l5rSw0yVb%2FvlWAYkK%2FYBwk%3D\", oauth_consumer_key=\"katello\", oauth_nonce=\"zDqfEqvwKTnWqlaTqi8sF3QrqY5XWN6jw7GyDfpxM4\", oauth_signature=\"GI13syno0e5DLg5o%2FJEWXDD5AA4%3D\", oauth_signature_method=\"HMAC-SHA1\", oauth_timestamp=\"1442718842\", oauth_version=\"1.0\"", "Content-Length"=>"4236", "accept"=>"application/json", "content_type"=>"application/json", "pulp-user"=>"hoici-59c70e2b"
 |
[ERROR 2015-09-20 03:14:03 pulp_rest  #2960] RestClient.post "https://li-lc-1578.hag.hilti.com/pulp/api/v2/content/units/rpm/search/", 4236 byte(s) length, "Accept"=>"*/*; q=0.5, application/xml", "Accept-Encoding"=>"gzip, deflate", "Authorization"=>"OAuth oauth_body_hash=\"2jmj7l5rSw0yVb%2FvlWAYkK%2FYBwk%3D\", oauth_consumer_key=\"katello\", oauth_nonce=\"zDqfEqvwKTnWqlaTqi8sF3QrqY5XWN6jw7GyDfpxM4\", oauth_signature=\"GI13syno0e5DLg5o%2FJEWXDD5AA4%3D\", oauth_signature_method=\"HMAC-SHA1\", oauth_timestamp=\"1442718842\", oauth_version=\"1.0\"", "Content-Length"=>"4236", "accept"=>"application/json", "content_type"=>"application/json", "pulp-user"=>"hoici-59c70e2b"
 |
2015-09-20 03:14:03 [E] Server broke connection (RestClient::ServerBrokeConnection)
2015-09-20 03:14:03 [E] Server broke connection (RestClient::ServerBrokeConnection)
2015-09-20 03:14:03 [E] Server broke connection (RestClient::ServerBrokeConnection)
2015-09-20 03:14:03 [E] Server broke connection (RestClient::ServerBrokeConnection)
2015-09-20 03:14:03 [I] Client disconnected.
2015-09-20 03:14:03 [I] Client disconnected.
2015-09-20 03:14:03 [I] Client disconnected.
2015-09-20 03:14:16 [W] message could not be sent Dynflow::Listeners::Serialization::Protocol::Done[request_id: 94] because Broken pipe
2015-09-20 03:14:18 [W] message could not be sent Dynflow::Listeners::Serialization::Protocol::Done[request_id: 166] because Broken pipe


[crash] root@li-lc-1578:/var/log# tail -n10 cron-20150920
Sep 20 03:01:01 li-lc-1578 CROND[4018]: (root) CMD (run-parts /etc/cron.hourly)
Sep 20 03:01:01 li-lc-1578 run-parts(/etc/cron.hourly)[4018]: starting 0anacron
Sep 20 03:01:02 li-lc-1578 anacron[4030]: Anacron started on 2015-09-20
Sep 20 03:01:02 li-lc-1578 anacron[4030]: Will run job `cron.daily' in 13 min.
Sep 20 03:01:02 li-lc-1578 anacron[4030]: Jobs will be executed sequentially
Sep 20 03:01:02 li-lc-1578 run-parts(/etc/cron.hourly)[4032]: finished 0anacron
Sep 20 03:05:01 li-lc-1578 CROND[5658]: (root) CMD (/usr/lib64/sa/sa1 1 1)
Sep 20 03:10:01 li-lc-1578 CROND[7633]: (root) CMD (/usr/lib64/sa/sa1 1 1)
Sep 20 03:14:02 li-lc-1578 anacron[4030]: Job `cron.daily' started
Sep 20 03:14:02 li-lc-1578 run-parts(/etc/cron.daily)[9498]: starting logrotate

[crash] root@li-lc-1578:/var/log# head -n20 cron-20150927
Sep 20 03:15:01 li-lc-1578 CROND[10115]: (root) CMD (/usr/lib64/sa/sa1 1 1)
Sep 20 03:18:24 li-lc-1578 run-parts(/etc/cron.daily)[11109]: finished logrotate
Sep 20 03:18:24 li-lc-1578 run-parts(/etc/cron.daily)[9498]: starting makewhatis.cron
Sep 20 03:18:26 li-lc-1578 run-parts(/etc/cron.daily)[11272]: finished makewhatis.cron
Sep 20 03:18:26 li-lc-1578 run-parts(/etc/cron.daily)[9498]: starting mlocate.cron
Sep 20 03:18:52 li-lc-1578 run-parts(/etc/cron.daily)[11574]: finished mlocate.cron
Sep 20 03:18:52 li-lc-1578 run-parts(/etc/cron.daily)[9498]: starting prelink
Sep 20 03:18:52 li-lc-1578 run-parts(/etc/cron.daily)[11587]: finished prelink
Sep 20 03:18:52 li-lc-1578 run-parts(/etc/cron.daily)[9498]: starting readahead.cron
Sep 20 03:18:52 li-lc-1578 run-parts(/etc/cron.daily)[11599]: finished readahead.cron
Sep 20 03:18:52 li-lc-1578 run-parts(/etc/cron.daily)[9498]: starting rhsmd
Sep 20 03:18:55 li-lc-1578 run-parts(/etc/cron.daily)[11609]: finished rhsmd
Sep 20 03:18:55 li-lc-1578 run-parts(/etc/cron.daily)[9498]: starting tmpwatch
Sep 20 03:18:55 li-lc-1578 run-parts(/etc/cron.daily)[11648]: finished tmpwatch
Sep 20 03:18:55 li-lc-1578 anacron[4030]: Job `cron.daily' terminated
Sep 20 03:18:55 li-lc-1578 anacron[4030]: Normal exit (1 job run)
Sep 20 03:20:02 li-lc-1578 CROND[12389]: (root) CMD (/usr/lib64/sa/sa1 1 1)
Sep 20 03:25:01 li-lc-1578 CROND[14994]: (root) CMD (/usr/lib64/sa/sa1 1 1)
Sep 20 03:30:01 li-lc-1578 CROND[16073]: (root) CMD (/usr/lib64/sa/sa1 1 1)
Sep 20 03:30:01 li-lc-1578 CROND[16074]: (foreman) CMD (   /usr/sbin/foreman-rake trends:counter >>/var/log/foreman/cron.log 2>&1)


Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1. Run a Content Management (Pulp) process continuously
2. Wait for the weekly log rotation

Actual results:
Satellite stops responding

Expected results:
Satellite continues to work

Additional info:

Comment 1 Peter Vreman 2015-10-05 08:29:16 UTC
Simple reproduction:
1. Start syncing all Red Hat repositories (we have 50+ of them)
2. Execute 'logrotate -f -v /etc/logrotate.d'

Comment 6 Peter Vreman 2015-11-10 16:52:06 UTC
Confirmed that Sat6.1.3 still has this issue.

Reproduction:
1. Start syncing all Red Hat repositories (we have 50+ of them)
2. Wait 5 minutes so the syncs are all actually running
3. Execute the logrotate postrotate command for mongodb (a sketch of what this does follows):
   /bin/kill -USR1 `cat /var/run/mongodb/mongodb.pid 2>/dev/null` 2> /dev/null|| true
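
For context, mongod treats SIGUSR1 as a request to rotate its own log file, so this command replays just the mongodb postrotate step by hand. The same thing as a minimal sketch with the PID-file check made explicit (paths are this system's defaults; verify them before running):

   # Replay the mongodb logrotate postrotate step manually.
   # mongod reopens/rotates its log file when it receives SIGUSR1.
   PIDFILE=/var/run/mongodb/mongodb.pid
   if [ -r "$PIDFILE" ]; then
       /bin/kill -USR1 "$(cat "$PIDFILE")"
   fi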

Comment 7 Peter Vreman 2015-11-10 17:10:51 UTC
The expectation is that log rotation is silent: it should only rotate log files and should not impact any other transactions, and therefore it should never trigger errors in any component.

Comment 10 Stephen Benjamin 2015-12-04 15:07:53 UTC
I can't reproduce the problem with mongo when I rotate only mongo:

  logrotate -f -v /etc/logrotate.d/mongodb

Satellite 6.1.x logs to the wrong place.  By default, things are going to /var/lib/mongodb/mongodb.log instead of /var/log (fixed in 6.2), but logrotate will rotate /var/log/mongodb/mongodb.log, so unless you manually changed something I can't see that being the cause.
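
A quick way to confirm that mismatch from a shell (a sketch; /etc/mongodb.conf as the mongod config path is an assumption for this EL6 install):

  # where the logrotate stanza points
  cat /etc/logrotate.d/mongodb
  # where mongod is actually configured to write (the logpath setting)
  grep -i logpath /etc/mongodb.conf
  # or ask the running daemon which log file it holds open
  ls -l /proc/$(cat /var/run/mongodb/mongodb.pid)/fd 2>/dev/null | grep -i '\.log'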

I can cause the syncs to stall by rotating everything:

  logrotate -f -v /etc/logrotate.d

I narrowed it down to Apache, which gets reloaded. If I start a sync of a bunch of repos, wait for the task to begin downloading, and then run:

  logrotate -f -v /etc/logrotate.d/httpd

you'll see the httpd logs get rotated, and in postrotate the service gets reloaded.
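
For reference, a stock RHEL /etc/logrotate.d/httpd stanza looks roughly like this sketch (exact contents vary by release, so treat it as an illustration rather than the file from this system):

  /var/log/httpd/*log {
      missingok
      notifempty
      sharedscripts
      delaycompress
      postrotate
          # this reload is what ripples through to the apps running under apache
          /sbin/service httpd reload > /dev/null 2>/dev/null || true
      endscript
  }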

That causes all the apps to get reloaded too, both Rails and Pulp. Rails seems to recover fine, and Pulp also gets reloaded, but it never picks up the task again, so the sync stays stuck forever.


Does that sound like what you're seeing, Peter?

@Michael - This does seem to be a pulp issue, if it can't recover from the httpd service being reloaded.

Comment 11 Peter Vreman 2015-12-04 15:17:14 UTC
See comment 6 for how to reproduce without logrotate, because logrotate does not rotate more than once per day.

Reproduction:
1. Start syncing all Red Hat repositories (we have 50+ of them)
2. Wait 5 minutes so the syncs are all actually running
3. Execute the logrotate postrotate command for mongodb:
   /bin/kill -USR1 `cat /var/run/mongodb/mongodb.pid 2>/dev/null` 2> /dev/null|| true

Comment 12 Peter Vreman 2015-12-04 15:23:16 UTC
Regarding the tasks being stuck forever: I have also seen that; it is mentioned in this BZ under See Also as BZ1264330.
Even restarting the repo syncs after cancelling them does not recover. I have the problem with both YUM and Puppet repos.

Maybe there is also a relation to BZ1279502.

Comment 13 Stephen Benjamin 2015-12-04 17:30:01 UTC
The logrotate config for mongo and the location mongo is actually logging to aren't even the same place, so I can't see how it'd cause the problem, because we're not touching the mongo log at all - unless you changed it?

But logrotate -f will indeed rotate many times a day; it forces the rotation if there are any new entries.

Regardless, I did read comment #6, and I tried it, but sending SIGUSR1 manually to mongo didn't do anything.  My syncs continued and the Satellite was always responsive.

Can you attach an sosreport to the case so I can see if I can find anything in there?

Comment 14 Michael Hrivnak 2015-12-09 14:48:42 UTC
Since rsyslogd is the process that does the log file writing for pulp, I don't think you should need to restart pulp worker processes during log rotation. I also don't see any evidence in the logrotate files that pulp workers get restarted, so let's assume those processes keep running. Those are the processes that work on tasks such as sync.

Reloading apache should not have an impact on the worker processes that do tasks like sync and publish. Just to be sure, I tried with upstream pulp just now (a rough shell sketch follows the steps):

1) start a sync task that will take a few minutes to run
2) stop apache
3) watch the logs and see that the task completes successfully
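
A rough shell sketch of that check, assuming Pulp 2's pulp-admin CLI, an illustrative repo id, and RHEL 6 sysvinit commands (on RHEL 7, use systemctl instead of service):

  # 1) start a sync that will run for a few minutes (repo id is illustrative)
  pulp-admin rpm repo sync run --repo-id=zoo --bg
  # 2) stop apache while the worker is busy
  service httpd stop
  # 3) pulp workers log via rsyslog, so watch the task run to completion
  tail -f /var/log/messages
  # 4) bring apache back
  service httpd start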

Can you provide more information about what you think pulp is doing wrong? Perhaps logs from pulp, especially if you see an error? Or a task report showing the task state as pulp is reporting it?

The errors I see above look like katello failing to connect to pulp's REST API, which is an expected circumstance if apache is being restarted.

Comment 15 Peter Vreman 2015-12-09 16:36:45 UTC
I am using RHEL 6.7.

Comment 16 Michael Hrivnak 2015-12-10 19:31:44 UTC
I tried to reproduce with pulp 2.6.5 upstream and could not. I started a long-running sync, then I restarted httpd several times while it ran. I also tried stopping httpd and leaving it off for a while. The worker kept syncing and completed successfully.

I also tried running "logrotate -f -v httpd" during the sync, which did cause httpd to restart, but did not affect the sync.

Comment 17 Stephen Benjamin 2015-12-14 14:30:46 UTC
@Michael - I can reproduce it on el7, satellite 6.1.4. But anyway, Peter said it's not the issue he's seeing, so I derailed the conversation a bit.  I'll open a separate bug for this, as his problem is somehow related to mongo.


@Peter - Can you upload an sosreport to the case so I can look at the logs?

Thanks!

Comment 18 Peter Vreman 2015-12-14 15:13:58 UTC
Sorry, the problem for me is also related to the logrotate of httpd, not mongodb.

I tried the mongodb signal and nothing happened.

But with the logrotate of httpd I could reproduce the issue:

[crash] root@li-lc-1578:~# history | grep httpd
  513  2015-12-14 15:10:12: logrotate -f -v /etc/logrotate.d/httpd
  514  2015-12-14 15:12:01: history | grep httpd

In the /var/log/foreman/production.log there is:

2015-12-14 15:10:14 [I] Client disconnected.
2015-12-14 15:10:14 [I] Client disconnected.
2015-12-14 15:10:14 [I] Client disconnected.
2015-12-14 15:10:14 [I] Client disconnected.
2015-12-14 15:10:14 [I] Client disconnected.
2015-12-14 15:10:37 [I] Connecting to database specified by database.yml
2015-12-14 15:10:59 [W] Creating scope :completer_scope. Overwriting existing method Location.completer_scope.
2015-12-14 15:11:00 [W] Creating scope :completer_scope. Overwriting existing method Organization.completer_scope.
2015-12-14 15:11:07 [W] Creating scope :latest. Overwriting existing method Scaptimony::ArfReport.latest.
2015-12-14 15:11:11 [W] message could not be sent Dynflow::Listeners::Serialization::Protocol::Done[request_id: 62] because Broken pipe
2015-12-14 15:11:32 [W] message could not be sent Dynflow::Listeners::Serialization::Protocol::Done[request_id: 66] because Broken pipe

Comment 19 Peter Vreman 2015-12-14 15:21:15 UTC
Uploaded sosreport to case 01517816

Comment 26 Bryan Kearney 2016-11-15 20:46:30 UTC
There were several fixes to log rotation in 6.2. Peter, are you still seeing this on 6.2? We have not been able to reproduce it.

Comment 27 Peter Vreman 2016-11-17 16:16:43 UTC
Bryan, I have not faced any issues with 6.2 during log rotation.

Comment 28 Bryan Kearney 2016-11-17 16:18:47 UTC
Thank you, Peter. I am closing this out as fixed in 6.2 based on comment 27. If folks still see issues, please feel free to re-open or create a new BZ.

