Bug 963942 - rsyslog doesn't clear $WorkDirectory for disk-assisted queue when remote rsyslog server comes back online.
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: rsyslog
Version: 6.4
Hardware: x86_64
OS: Linux
Priority: medium
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Tomas Heinrich
QA Contact: Karel Srot
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2013-05-16 20:20 UTC by sean.edge
Modified: 2018-12-05 16:00 UTC
CC List: 10 users

Fixed In Version: rsyslog-5.8.10-7.el6
Doc Type: Bug Fix
Doc Text:
Cause: Wrong data type of a variable holding the spool file size limit. Consequence: This could have resulted in message loss and the intended size limit not being respected. Fix: The package was patched and the data type corrected. Result: Correct spool files should be created and messages resent as expected.
Clone Of:
Environment:
Last Closed: 2013-11-21 23:40:41 UTC
Target Upstream Version:
Embargoed:


Attachments
Full /etc/rsyslog.conf of client (11.14 KB, application/octet-stream), 2013-05-16 20:20 UTC, sean.edge
correction to the stream obj's interface (1.86 KB, patch), 2013-07-18 13:01 UTC, Tomas Heinrich


Links
Red Hat Product Errata RHBA-2013:1716 (normal, SHIPPED_LIVE): rsyslog bug fix update, last updated 2013-11-20 21:51:43 UTC

Description sean.edge 2013-05-16 20:20:56 UTC
Created attachment 749044 [details]
Full /etc/rsyslog.conf of client

Description of problem:
Using a disk-assisted $ActionQueue (type LinkedList) to forward messages to a remote rsyslog server via TCP. If the log server has been down for a period of time, messages that were spooled to disk for various reasons (rsyslog restart, in-memory queue full, etc.) never get reprocessed and sent to the remote rsyslog server.

Version-Release number of selected component (if applicable):
[root@n5 ~]# rpm -q rsyslog
rsyslog-5.8.10-6.el6.x86_64


How reproducible:
Easily

Steps to Reproduce:
1. Configure rsyslog with an in-memory, disk-assisted forwarding queue.
2. Take down the remote log server for a period of time. I waited a couple of days, long enough to use all the inodes on a 15 GB system.
3. Start the remote logging server.
4. Monitor $WorkDirectory on the original server. The files never get cleaned up.
Actual results:
The configured $WorkDirectory never gets cleaned of old log messages, even though there is a .qi (QueueIndex) file that points to legitimate syslog message files. This is problematic because, if a lot of messages get backed up, it isn't difficult to exhaust all the available inodes on the filesystem.

Expected results:
When the remote log server comes back online, I would expect rsyslog to replay the disk-backed messages to it and clean the configured $WorkDirectory.

Additional info:
Relevant bits of /etc/rsyslog.conf on the client servers:
$WorkDirectory /var/spool/rsyslog_spool

$ActionQueueType LinkedList
$ActionQueueFileName aqfwd
$ActionQueueSize 150000
$ActionQueueLowWaterMark 30000          
$ActionQueueHighWaterMark 120000        
$ActionQueueSaveOnShutdown on
$ActionQueueDiscardSeverity 7           
$ActionQueueMaxFileSize 10g             
$ActionResumeRetryCount -1
*.* @@logserver:10514;RSYSLOG_ForwardFormat



All inodes exhausted by aqfwd messages in /var/spool/rsyslog_spool:
[root@n5 ~]# df -i
Filesystem            Inodes   IUsed   IFree IUse% Mounted on
/dev/mapper/myvg-rootvol
                      971040  971040       0  100% /
tmpfs                1007547       1 1007546    1% /dev/shm
/dev/sda1              51200      51   51149    1% /boot

[root@n5 ~]# find /var/spool/rsyslog_spool -name 'aqfwd.*' | wc -l
858883

Comment 4 sean.edge 2013-05-20 13:38:14 UTC
QueueIndex file contents:
[root@n5 ~]# cat /var/spool/rsyslog_spool/aqfwd.qi
<OPB:1:qqueue:1:
+iQueueSize:2:3:621:
+tVars.disk.sizeOnDisk:2:6:348756:
+tVars.disk.bytesRead:2:1:0:
>End
.
<Obj:1:strm:1:
+iCurrFNum:2:1:1:
+pszFName:1:5:aqfwd:
+iMaxFiles:2:8:10000000:
+bDeleteOnClose:2:1:0:
+sType:2:1:1:
+tOperationsMode:2:1:2:
+tOpenMode:2:3:384:
+iCurrOffs:2:1:0:
>End
.
<Obj:1:strm:1:
+iCurrFNum:2:1:1:
+pszFName:1:5:aqfwd:
+iMaxFiles:2:8:10000000:
+bDeleteOnClose:2:1:1:
+sType:2:1:1:
+tOperationsMode:2:1:1:
+tOpenMode:2:3:384:
+iCurrOffs:2:1:0:
>End
.

Syslog message the QI file points to:
[root@n5 ~]# cat /var/spool/rsyslog_spool/aqfwd.00000001
<Obj:1:msg:1:
+iProtocolVersion:2:1:0:
+iSeverity:2:1:5:
+iFacility:2:2:16:
+msgFlags:2:1:0:
+ttGenTime:2:10:1368676097:
+tRcvdAt:3:34:2:2013:5:16:3:48:17:881464:6:+:0:0:
+tTIMESTAMP:3:34:2:2013:5:16:3:48:17:881464:6:+:0:0:
+pszTAG:1:5:audit:
+pszRawMsg:1:115:94): item=0 name="/tmp/" inode=652801 dev=fd:00 mode=041777 ouid=0 ogid=0 rdev=00:00 obj=system_u:object_r:tmp_t:s0:
+pszHOSTNAME:1:2:n5:
+pszInputName:1:6:imfile:
+pszRcvFrom:1:0::
+pszRcvFromIP:1:0::
+offMSG:2:1:0:
>End
.
s0 tclass=tcp_socket:
+pszHOSTNAME:1:2:n5:
+pszInputName:1:6:imfile:
+pszRcvFrom:1:0::
+pszRcvFromIP:1:0::
+offMSG:2:1:0:
>End
.

Comment 7 Soham Chakraborty 2013-06-12 14:49:05 UTC
A few things will be needed here, in order of requirement and data collection.

1) Start a loop of the logger command on the client. Something as simple as this:


for i in {1..100}; do logger $i "Test"; done

Increase the value of i if needed and insert a sleep.

2) Ideally, "$i Test" will appear on the remote log server.

3) Take down the remote log server. Stop rsyslog.

4) Now, while the logger loop is still running, the on-disk queue is being created and is growing.

5) Start a tcpdump at both client and server.

tcpdump -s0 -i ethX -w /tmp/from_client.pcap host <server-ip>
tcpdump -s0 -i ethX -w /tmp/from_server.pcap host <client-ip>

6) Start rsyslog in debug mode on server.

pkill rsyslog
/sbin/rsyslogd -c5 -dn > logfile

Let it run for around 5 minutes. In the problematic circumstance, "$i Test" will not arrive at the remote log server.

7) Run a manual logger "This is manual test" from the client. Now see whether this message reaches the server or not.

8) Ctrl-C the rsyslog instance running in debug mode on the server.

9) Restart rsyslog normally on server.

We will need to check the tcpdump captures, the debug logfile, and /var/log/messages from the server. A rough script consolidating the client-side steps is below.
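
For convenience, the client-side steps could be driven by a small script along these lines (a rough sketch; eth0, the server address 192.0.2.10, and the message pacing are placeholders to adapt, not values from this bug):

#!/bin/bash
# Hypothetical driver for the collection steps above.
tcpdump -s0 -i eth0 -w /tmp/from_client.pcap host 192.0.2.10 &
TCPDUMP_PID=$!
for i in {1..100}; do
    logger "$i Test"
    sleep 5          # pace the messages so gaps are easy to spot
done
kill "$TCPDUMP_PID"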

Comment 9 Karel Srot 2013-06-13 07:48:10 UTC
I am not able to reproduce the bug. In my case, messages are properly delivered and the queue files are deleted, although I didn't keep the collector down for as long.

According to the rsyslog documentation (http://www.rsyslog.com/doc/rsconf1_actionresumeinterval.html), the resume interval is (numRetries / 10 + 1) * $ActionResumeInterval. With the default interval of 30 seconds, that is 60 seconds after the 10th retry and 330 seconds after the 100th. Is it possible that the reporter didn't give the client enough time to contact the server? If the server was down for several days, the resume interval could have grown to more than ten minutes.
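
As a quick sanity check of that schedule (assuming the documented default $ActionResumeInterval of 30 seconds):

# suspend interval = (numRetries / 10 + 1) * $ActionResumeInterval
interval=30
for n in 1 10 100 1000; do
    echo "after retry $n: $(( (n / 10 + 1) * interval )) seconds"
done
# prints 30, 60, 330, and 3030 seconds respectively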


Soham,
do you observe message loss? Which rsyslog.conf do you use?

Comment 11 sean.edge 2013-06-13 13:37:52 UTC
Here is the relevant section of my config again, though I believe I attached the whole thing to this bug:

# Forward messages to central log server.
$ActionQueueType LinkedList
$ActionQueueFileName aqfwd
$ActionQueueSize 1000
$ActionQueueLowWaterMark 200            
$ActionQueueHighWaterMark 800   
$ActionQueueSaveOnShutdown on
#$ActionQueueDiscardSeverity 7          # Dont discard messages.
$ActionQueueDiscardMark 975
$ActionQueueMaxFileSize 1m              
$ActionResumeRetryCount -1              
$ActionQueueMaxDiskSpace 1g
*.* @@logserver:10514;RSYSLOG_ForwardFormat 


Karel, if you look at the support case you can see that it's been about a week since I ran a test similar to what Soham is asking for. The spool files are still sitting there.

Soham,  I'll try to run your test some time today.

Comment 12 sean.edge 2013-06-13 13:56:55 UTC
> 
> 5) Start a tcpdump at both client and server.
> 
> tcpdump -s0 -i ethX -w /tmp/from_client.pcap host <server-ip>
> tcpdump -s0 -i ethX -w /tmp/from_server.pcap host <client-ip>
> 

I'm pretty sure from_client should have the client IP in this tcpdump line, right?

Comment 13 sean.edge 2013-06-13 14:01:22 UTC
Created attachment 760707 [details]
from_server.pcap

Comment 14 sean.edge 2013-06-13 14:02:00 UTC
Created attachment 760708 [details]
from_client.pcap

Comment 15 sean.edge 2013-06-13 14:10:43 UTC
Soham,
I did what you asked.  For #7, yes, the manual message reaches the logserver.


My spool directory continues to grow:

-rw-------. 1 root root     392 Jun  7 14:09 aqfwd.00015436
-rw-------. 1 root root 1048690 Jun  7 14:12 aqfwd.00015442
-rw-------. 1 root root 1048856 Jun  7 14:13 aqfwd.00015443
-rw-------. 1 root root     496 Jun 11 14:18 aqfwd.qi
-rw-------. 1 root root 1048807 Jun 13 13:52 aqfwd.00015444
-rw-------. 1 root root 1048887 Jun 13 13:52 aqfwd.00015445
-rw-------. 1 root root 1048942 Jun 13 13:54 aqfwd.00015446
-rw-------. 1 root root  200056 Jun 13 13:54 aqfwd.00015447


This large gap in test messages is where I had rsyslog down on the server side:

2013-06-13T13:49:33.345911+00:00 n4 root: 1 Test
2013-06-13T13:49:38.352978+00:00 n4 root: 2 Test
2013-06-13T13:49:43.360908+00:00 n4 root: 3 Test
2013-06-13T13:49:48.368667+00:00 n4 root: 4 Test
2013-06-13T13:49:53.376028+00:00 n4 root: 5 Test
2013-06-13T13:49:58.383314+00:00 n4 root: 6 Test
2013-06-13T13:50:03.391075+00:00 n4 root: 7 Test
2013-06-13T13:50:08.398741+00:00 n4 root: 8 Test
2013-06-13T13:50:13.406110+00:00 n4 root: 9 Test
2013-06-13T13:53:38.317058+00:00 n4 root: 6786 Test
2013-06-13T13:53:38.337640+00:00 n4 root: 6787 Test
2013-06-13T13:53:38.357610+00:00 n4 root: 6788 Test
2013-06-13T13:53:38.377921+00:00 n4 root: 6789 Test
2013-06-13T13:53:38.398555+00:00 n4 root: 6790 Test
2013-06-13T13:53:38.420186+00:00 n4 root: 6791 Test
2013-06-13T13:53:38.440441+00:00 n4 root: 6792 Test
2013-06-13T13:53:38.460366+00:00 n4 root: 6793 Test
2013-06-13T13:53:38.479349+00:00 n4 root: 6794 Test
2013-06-13T13:53:38.495890+00:00 n4 root: 6795 Test
2013-06-13T13:53:38.518186+00:00 n4 root: 6796 Test
2013-06-13T13:53:38.540254+00:00 n4 root: 6797 Test
2013-06-13T13:53:38.562547+00:00 n4 root: 6798 Test
2013-06-13T13:53:38.584377+00:00 n4 root: 6799 Test
2013-06-13T13:53:38.605304+00:00 n4 root: 6800 Test
2013-06-13T13:53:38.626186+00:00 n4 root: 6801 Test
2013-06-13T13:53:38.647346+00:00 n4 root: 6802 Test
2013-06-13T13:53:38.668363+00:00 n4 root: 6803 Test
2013-06-13T13:53:38.689164+00:00 n4 root: 6804 Test
2013-06-13T13:53:38.709830+00:00 n4 root: 6805 Test
2013-06-13T13:53:38.730565+00:00 n4 root: 6806 Test


I never get "10 Test" if I look at the logserver's side.

[root@mpoam5-logserver 13]# egrep '\b9 Test' messages
2013-06-13T13:50:13.406110+00:00 n4 root: 9 Test
[root@mpoam5-logserver 13]# egrep '\b10 Test' messages
[root@mpoam5-logserver 13]#

Comment 16 Soham Chakraborty 2013-06-13 14:17:09 UTC
(In reply to sean.edge from comment #12)
> > 
> > 5) Start a tcpdump at both client and server.
> > 
> > tcpdump -s0 -i ethX -w /tmp/from_client.pcap host <server-ip>
> > tcpdump -s0 -i ethX -w /tmp/from_server.pcap host <client-ip>
> > 
> 
> I'm pretty sure from_client should have the client ip in this tcpdump line,
> right?

Sean,

let's take the data to the issue tracker. This is a public bug and we don't want the tcpdump captures and debug log to be world-visible. Please remove the files here and attach them to the support case; I will look at them there.

If anything is needed on the BZ, I will ask accordingly.

Comment 17 sean.edge 2013-06-13 14:30:48 UTC
I don't see any way to delete them.

Comment 18 sean.edge 2013-06-13 14:35:16 UTC
The only action I have available on them is to view them.  Can you remove them?

Comment 19 Soham Chakraborty 2013-06-13 14:38:48 UTC
Sean, Karel has made them private, so only we will be able to view them. Don't worry.

Comment 20 sean.edge 2013-06-13 14:45:50 UTC
Perfect, thanks!  I'll need to use dropbox.redhat.com to upload the rsyslog debug output because it's too large. Keep an eye out for that in the next few minutes.

Comment 21 sean.edge 2013-06-13 14:51:33 UTC
Never mind, I compressed it and was able to upload it. It's attached to the support case now.

Comment 24 Karel Srot 2013-06-21 05:32:08 UTC
Tomas,
do you think rsyslog is able to forward messages stored in disk queues even after it has used all the inodes on the disk? From the description (#c0) I cannot tell whether the logs kept coming even when all inodes were used.

I am curious how important the "use all inodes" part is, and whether the scenario works OK for the reporter when there are free inodes left.

Comment 25 sean.edge 2013-06-21 13:35:38 UTC
Karel,
I tried freeing up some inodes and restarting rsyslog, wondering if perhaps rsyslog needed to write some temporary/work files in order to process the spooled messages.  Unfortunately that didn't help.

Comment 26 Karel Srot 2013-06-25 14:09:02 UTC
(In reply to sean.edge from comment #25)
> Karel,
> I tried freeing up some inodes and restarting rsyslog, wondering if perhaps
> rsyslog needed to write some temporary/work files in order to process the
> spooled messages.  Unfortunately that didn't help.

And is the inode exhaustion crucial for the reproducer? Does the scenario (queued log delivery when the server comes back online) work otherwise? Thank you.

Comment 27 sean.edge 2013-06-25 14:14:57 UTC
Karel,
No, it's not crucial for the reproducer; I was just listing it as a side effect. The real issue is that log messages in the spool directory are NEVER forwarded to the remote logging server, even after it comes back online and after various rsyslog service restarts, nor is the spool directory cleared. This creates a condition where we run out of inodes in our dev environment. What concerns me most is the messages never being reprocessed. Hopefully that helps.

Comment 29 Tomas Heinrich 2013-07-16 09:26:25 UTC
I can't reproduce the problem either.

I've tried taking down the receiver, sending ~500K messages and bringing it up again. After a while, all the messages are sent to the other side.

I've also tried taking down the receiver, sending a couple thousand msgs, stopping rsyslog, starting the receiver and starting rsyslog. Success again.

I've used a slightly altered version of the configuration you'd provided.

Comment 30 sean.edge 2013-07-16 13:42:23 UTC
Tomas,
I was playing around with this a bit yesterday, and here's what I saw.

1) 10g as an ActionQueueMaxFileSize does not work as expected. It should create a file of up to approximately 10 GB for queued syslog messages (when the queue is full), but instead it creates one file per message. Changing this value to 1g, 50m, 10m, or 1m all do what you would expect. This was causing me to run out of inodes in my dev environment (VMs without much disk space).
2) I was monitoring the work directory with 'watch -n2 ls -ltr' while doing these tests. Once the receiver came back, a large number of spool files were cleared (~4K), but the rest were not. I double-checked, and those messages were not resent to the receiver. I'm not sure if I simply wasn't patient enough, but it seemed to stop processing the spooled syslog messages.

I've happily changed my ActionQueueMaxFileSize to 50m and am content with that as a resolution for the time being.

Thanks.

Comment 31 Soham Chakraborty 2013-07-16 15:27:56 UTC
Sean,

Let me try to formalize your last comment. 

You mean, $ActionQueueMaxFileSize, whatever value is set, doesn't work as the name suggests. That is, when the server is down and the client is creating spool files on disk (to be sent later), it doesn't allow the spool file(s) to grow to a single 10 GiB file, but creates smaller files per message (e.g., logger "A" and then logger "B" creates two spool files, of the sizes of "A" and "B"). Is this correct?

And when that happens, the filesystem on the client disk, i.e. the one holding /var/spool/rsyslog (in /var to be specific), runs short on inodes, and we eventually see loss of spooled messages.

Am I correct in stating the first point in these terms?

Also, regarding the second point, I don't know either. Probably your testing is what we can rely on.

Also, how does setting ActionQueueMaxFileSize to 50 MiB help? I mean, if you generate 50 files of 1 MiB each, then we still consume 50 inodes. Is this correct?

Please let me know.

Soham

Comment 32 sean.edge 2013-07-16 17:46:24 UTC
(In reply to Soham Chakraborty from comment #31)
> You mean, $ActionQueueMaxFileSize, whatever value is set, doesn't work as
> the name suggests. That is, when the server is down and client is creating
> spool disk files (supposed to be sent later), it doesn't allow the spool
> file(s) to grow to a single 10 GiB. But, creates smaller files per message
> (like - logger "A" and then logger "B"). This will create two log files of
> the size of "A" and "B". Is this correct?

Yes, but only when it's set to 10g. The other values that I previously listed work as expected and fill a spool file until it reaches the defined max; then rsyslog creates a new one and continues on.


> And when that happens, the filesystem on the client disk, i.e. the one
> holding /var/spool/rsyslog (in /var to be specific), runs short on inodes,
> and we eventually see loss of spooled messages.

Presumably, yes. I haven't verified that myself, but it's a reasonable assumption. The messages don't have anywhere else to go.

> Am I correct in stating the first point in these terms?

I think so. With $ActionQueueMaxFileSize=10g, rsyslog creates one spool file per syslog message. With the other values I mentioned, rsyslog creates a spool file, fills it until maxfilesize is met or exceeded, and then creates a new spool file.

> Also, regarding the second point, I don't know either. Probably your
> testing is what we can rely on.

I don't know what you mean here.

> Also, how does setting ActionQueueMaxFileSize to 50 MiB help? I mean, if
> you generate 50 files of 1 MiB each, then we still consume 50 inodes. Is
> this correct?

I don't completely understand what you wrote here. Since rsyslog can now put many messages in a single file, we are much less likely to run out of inodes. Previously it was creating one spool file per queued message, which meant I had hundreds of thousands of spool files in my work directory that rsyslog never cleared out.
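
To put rough numbers on it (the ~500-byte record size is just an assumption for illustration, not something I measured):

msgs=858883                              # spool file count from my earlier df -i report
per_file=$(( 50 * 1024 * 1024 / 500 ))   # ~104857 records fit in one 50m spool file
echo "one file per message: $msgs inodes"
echo "50m spool files: $(( (msgs + per_file - 1) / per_file )) inodes"
# about 9 inodes instead of ~859k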

Comment 33 Tomas Heinrich 2013-07-17 16:04:04 UTC
(In reply to sean.edge from comment #32)
> > Also, how does setting ActionQueueMaxFileSize to 50 MiB help? I mean, if
> > you generate 50 files of 1 MiB each, then we still consume 50 inodes. Is
> > this correct?
> 
> I don't completely understand what you wrote here. <...>

I think there's a misunderstanding on Soham's part that... 

(In reply to Soham Chakraborty from comment #31)
> You mean, $ActionQueueMaxFileSize, whatever value is set, doesn't work as
> the name suggests.

The wrong behavior should manifest only with the 10GB argument, as Sean stated; therefore...

(In reply to sean.edge from comment #30)
> I've happily changed my ActionQueueMaxFileSize to 50m and am content with
> that as a resolution for the time being.

Comment 34 Tomas Heinrich 2013-07-17 16:26:40 UTC
(In reply to sean.edge from comment #30)
> Tomas,
> I was playing around with this a bit yesterday and here's what I saw.  

> 1) 10g as an ActionQueueMaxFileSize does not work as expected.  This should
> create up to an approximately 10GB file for queued syslog messages (when
> queue is full) but instead it creates one file per message.  Changing this
> value to 1g,50m,10m,1m all do what you would expect.  This was causing me to
> run out of inodes in my dev environment (VMs with not much disk space).

Thanks for letting us know; this is a weird misbehavior.

> 2) I was monitoring the work directory with 'watch -n2 ls -ltr' while doing
> these tests.  Once the receiver came back a large number of spool files were
> cleared (~4K) but the rest were not.  I double checked and those messages
> were not resent to the receiver.  I'm not sure if I wasn't patient enough or
> what, but it seemed to stop processing the spooled syslog messages.

The correct behavior should be to resend all the spooled messages when possible and remove all but one spool file.

> I've happily changed my ActionQueueMaxFileSize to 50m and am content with
> that as a resolution for the time being.

It's good that you have a workaround, but thanks to your information I'm now able to reproduce the symptoms. These may even be two separate issues; I'm not sure yet. The first is the ActionQueueMaxFileSize bug, and the second is rsyslog not attempting to resend the spooled messages.

I came up with these rough steps to reproduce the symptoms and wasn't able to reproduce it without using the 10g value. I use netcat as the receiver, as it's just a simple TCP data dump. A rough scripted version follows the list.

- have a working setup to forward the msgs to netcat
- bring down nc
- send ~5k msgs
- stop rsyslog
- start rsyslog
- bring nc up
- wait for rsyslog to sync the msgs to nc
  at this point, all the msgs should be forwarded
- bring nc down
- send ~5k msgs
- stop rsyslog
- start rsyslog
- bring nc up
  at this point, some msgs are sent but majority is not and the spool files are not erased
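
Roughly scripted (an untested sketch; port 10514 comes from the config in this bug, and "nc -l 10514" assumes a netcat that takes the listening port as its argument):

nc -l 10514 > /dev/null & NC=$!     # working setup: receiver up
sleep 5
kill $NC                            # bring nc down
for i in $(seq 1 5000); do logger "$i Test"; done
service rsyslog stop; service rsyslog start
nc -l 10514 > /dev/null & NC=$!     # bring nc up
sleep 60                            # wait for rsyslog to sync the msgs to nc
kill $NC                            # bring nc down again
for i in $(seq 5001 10000); do logger "$i Test"; done
service rsyslog stop; service rsyslog start
nc -l 10514 > /dev/null &           # bring nc up: most msgs are never sent
ls /var/spool/rsyslog_spool         # ...and the spool files are not erased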

I've reproduced this even with 10t, so this might be some value overflow somewhere. I will keep you posted.

Comment 35 sean.edge 2013-07-17 16:46:53 UTC
(In reply to Tomas Heinrich from comment #34)
> Thanks for letting us know; this is a weird misbehavior.

No problem.

> The correct behavior should be to resend all the spooled messages when
> possible and remove all but one spool file.

Agreed, and I saw the correct behavior when I used the lower maxfilesize values I listed.

> It's good that you have a workaround, but thanks to your information I'm
> now able to reproduce the symptoms. These may even be two separate issues;
> I'm not sure yet. The first is the ActionQueueMaxFileSize bug, and the
> second is rsyslog not attempting to resend the spooled messages.

I agree about the possibility of two separate issues. My manager and I had a similar suspicion when we talked about it the other day.

> I came up with these rough steps to reproduce the symptoms and wasn't able
> to reproduce it without using the 10g value. I use netcat as the receiver,
> as it's just a simple TCP data dump.
> 
> - have a working setup to forward the msgs to netcat
> - bring down nc
> - send ~5k msgs
> - stop rsyslog
> - start rsyslog
> - bring nc up
> - wait for rsyslog to sync the msgs to nc
>   at this point, all the msgs should be forwarded
> - bring nc down
> - send ~5k msgs
> - stop rsyslog
> - start rsyslog
> - bring nc up
>   at this point, some msgs are sent but majority is not and the spool files
> are not erased
> 
> I've reproduced this even with 10t, so this might be some value overflow
> somewhere. I will keep you posted.

Excellent!  Looking forward to hearing back about this!

Thanks, Tomas!

Comment 36 Soham Chakraborty 2013-07-17 17:36:55 UTC
Tomas, Sean,

Awesome, it's all clear now. I understand the inode problem much better.

I will see if I can help Tomas with this.

Soham

Comment 37 Tomas Heinrich 2013-07-18 13:01:36 UTC
Created attachment 775306 [details]
correction to the stream obj's interface

Looks like I've found the root cause: a bad type in one of the interfaces. Here's a simple patch.
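
For the record, the arithmetic fits the symptoms. Assuming the limit ends up in a signed 32-bit variable (my illustration, not the literal code) and the size suffixes are 1024-based, the configured values truncate like this:

# low 32 bits of each size limit, interpreted as a signed 32-bit value
for size in $(( 1024**3 )) $(( 10 * 1024**3 )) $(( 10 * 1024**4 )); do
    low=$(( size & 0xFFFFFFFF ))
    [ $low -ge $(( 1 << 31 )) ] && low=$(( low - (1 << 32) ))
    echo "$size bytes -> $low"
done
# 1g  ->  1073741824   (intact, works)
# 10g -> -2147483648   (negative limit: every record "exceeds" it)
# 10t ->  0            (zero limit: same effect, one file per message)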

Comment 41 errata-xmlrpc 2013-11-21 23:40:41 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-1716.html

Comment 42 printul77700 2016-01-07 15:35:22 UTC
Hello,
I know this is an old discussion, but I just encountered the same problem.
Could you please tell me if there is any way of sending those messages/files from the spool to processing, so they can get to the receiver/consumer? I expect there is a workaround that can make rsyslog read the spool, or at least I hope one exists.
My second question: do I need to have the .qi file? On some of my problematic servers the file is there, but on others it is not.
Thank you

Comment 43 Sirius Rayner-Karlsson 2016-01-08 07:55:06 UTC
(In reply to printul77700 from comment #42)
> Hello,
> I know this is an old discussion, but I just encountered the same problem.
> Could you please tell me if there is any way of sending those
> messages/files from the spool to processing, so they can get to the
> receiver/consumer? I expect there is a workaround that can make rsyslog
> read the spool, or at least I hope one exists.
> My second question: do I need to have the .qi file? On some of my
> problematic servers the file is there, but on others it is not.
> Thank you

Bugzilla is not a support channel, so if you are still experiencing problems after applying the errata (http://rhn.redhat.com/errata/RHBA-2013-1716.html) for the problem described in this Bugzilla, please raise a case through the official support channels at https://access.redhat.com/.

Kind regards,

/Anders

