Description of problem:
I'm getting 503 Service Unavailable from it. Might have been after I've uploaded a rather large patchset:

remote: New Changes:
remote:   https://review.gluster.org/#/c/glusterfs/+/20894 glfs-fops.c, glfs.c: strncpy() -> sprintf(), reduce strlen()'s
remote:   https://review.gluster.org/#/c/glusterfs/+/20895 {cli-cmd-parser|cli-rpc-ops||cli-xml-output}.c: strncpy()->sprintf(), reduce ...
remote:   https://review.gluster.org/#/c/glusterfs/+/20896 {mount-common|fusermount|mount_darwin|umountd}.c: strncpy()->sprintf(), ...
remote:   https://review.gluster.org/#/c/glusterfs/+/20897 extras/geo-rep/gsync-sync-gfid.c: move from strlen() to sizeof()
remote:   https://review.gluster.org/#/c/glusterfs/+/20898 multiple files: move from strlen() to sizeof()
remote:   https://review.gluster.org/#/c/glusterfs/+/20899 multiple files: move from strlen() to sizeof()
remote:   https://review.gluster.org/#/c/glusterfs/+/20900 bit-rot xlator: strncpy()->sprintf(), reduce strlen()'s
remote:   https://review.gluster.org/#/c/glusterfs/+/20901 changelog xlator: strncpy()->sprintf(), reduce strlen()'s
remote:   https://review.gluster.org/#/c/glusterfs/+/20902 changetimerecoder xlator: strncpy()->sprintf(), reduce strlen()'s
remote:   https://review.gluster.org/#/c/glusterfs/+/20903 xlators: move from strlen() to sizeof()
remote:   https://review.gluster.org/#/c/glusterfs/+/20904 NFS server (mount3.c, nfs-inodes.c): strncpy()->sprintf(), reduce strlen()'s
remote:   https://review.gluster.org/#/c/glusterfs/+/20905 multiple xlators: move from strlen() to sizeof()
remote:   https://review.gluster.org/#/c/glusterfs/+/20906 multiple xlators: strncpy()->sprintf(), reduce strlen()'s
remote:   https://review.gluster.org/#/c/glusterfs/+/20907 multiple xlators (mgmt): strncpy()->sprintf(), reduce strlen()'s
remote:   https://review.gluster.org/#/c/glusterfs/+/20908 multiple xlators (storage/posix): strncpy()->sprintf(), reduce strlen()'s
remote:   https://review.gluster.org/#/c/glusterfs/+/20909 Various files: strncpy()->sprintf(), reduce strlen()'s
remote:
remote: Pushing to refs/publish/* is deprecated, use refs/for/* instead.
To ssh://review.gluster.org/glusterfs
 * [new branch]      HEAD -> refs/publish/master/remove_strncpy2
Encountering the same problem:

Service Unavailable
The server is temporarily unable to service your request due to maintenance downtime or capacity problems. Please try again later.
Looking at it.
So Nigel did restart Gerrit, and this seems to be working now.
(In reply to M. Scherer from comment #3)
> So Nigel did restart Gerrit, and this seems to be working now.

Any RCA? My commits are not there. Should I re-submit? Now?
From gerrit's sshd_log:

[2018-08-22 19:07:52,728 +0000] 78847570 mykaul a/1001977 LOGIN FROM 127.0.0.1
[2018-08-22 19:08:07,718 +0000] 78847570 mykaul a/1001977 git-upload-pack./glusterfs 2ms 14372ms 0
[2018-08-22 19:08:08,088 +0000] 78847570 mykaul a/1001977 LOGOUT
[2018-08-22 19:08:15,459 +0000] 1873f98b mykaul a/1001977 LOGIN FROM 127.0.0.1
[2018-08-22 19:08:21,644 +0000] 1873f98b mykaul a/1001977 git-receive-pack./glusterfs 2ms 5568ms 0 git/2.17.1
[2018-08-22 19:08:21,845 +0000] 1873f98b mykaul a/1001977 LOGOUT

From /var/log/messages:

Aug 22 19:09:05 gerrit-new kernel: git invoked oom-killer: gfp_mask=0x201da, order=0, oom_score_adj=0
. . . .
Aug 22 19:09:07 gerrit-new kernel: Out of memory: Kill process 11061 (java) score 388 or sacrifice child
Aug 22 19:09:07 gerrit-new kernel: Killed process 11061 (java) total-vm:3814660kB, anon-rss:1503912kB, file-rss:0kB, shmem-rss:0kB

It looks like pushing so many patches in one go caused Gerrit and git to consume large amounts of memory, which led to Gerrit (the java process) being OOM killed. It also looks like we don't have any swap space on this box. I've just added 1G of swap to reduce the chance of this happening next time.

Yaniv, can you try pushing again?
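For anyone who wants to check /var/log/messages for the same symptom later, here is a minimal sketch that pulls the victim process and its memory figures out of a kernel "Killed process" line. The function name parse_oom_kill and the regular expression are my own illustration, not an existing tool on the box, and the pattern is only matched against the line format quoted in the comment above; other kernel versions may word it differently.

```python
import re

# Pattern for the kernel's "Killed process" line, written against the single
# example quoted in this bug. Assumption: other kernels may format it differently.
KILLED_RE = re.compile(
    r"Killed process (?P<pid>\d+) \((?P<name>[^)]+)\) "
    r"total-vm:(?P<total_vm>\d+)kB, anon-rss:(?P<anon_rss>\d+)kB"
)


def parse_oom_kill(line):
    """Return pid, process name and memory sizes in MiB, or None if no match."""
    match = KILLED_RE.search(line)
    if match is None:
        return None
    return {
        "pid": int(match.group("pid")),
        "name": match.group("name"),
        # The kernel reports kB (KiB); divide by 1024 to get MiB.
        "total_vm_mib": int(match.group("total_vm")) / 1024,
        "anon_rss_mib": int(match.group("anon_rss")) / 1024,
    }


# The line from /var/log/messages quoted in this bug.
SAMPLE = (
    "Aug 22 19:09:07 gerrit-new kernel: Killed process 11061 (java) "
    "total-vm:3814660kB, anon-rss:1503912kB, file-rss:0kB, shmem-rss:0kB"
)

if __name__ == "__main__":
    print(parse_oom_kill(SAMPLE))
```

On the quoted line this reports the java process with roughly 1.4 GiB resident and about 3.6 GiB of virtual memory, which fits a 4G box without swap running out.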
Pushed - seems to be OK - https://review.gluster.org/#/q/status:open+project:glusterfs+branch:master+topic:remove_strncpy2
Alright, closing this bug as fixed, both the issue itself and the problems that caused it. If Gerrit gets OOM killed again, the next course of action is to increase the RAM available on this box. We're currently at 4G, and we may just need 6G.
Note, Gerrit is very slow. Commands take a lot of time. The UI also seems sluggish.
If swap is added, can it also be added in Ansible?
It died again.
Do you really want to submit the above commits?
Type 'yes' to confirm, other to cancel: yes
remote:
remote: Processing changes: updated: 15, done
remote: (W) 6b1e5f8: commit subject >50 characters; use shorter first paragraph
remote: (W) 67bda53: commit subject >50 characters; use shorter first paragraph
remote: (W) 6ab7621: commit subject >50 characters; use shorter first paragraph
remote: (W) 3064b37: commit subject >50 characters; use shorter first paragraph
remote: (W) f73257c: commit subject >50 characters; use shorter first paragraph
remote: (W) 41d1f6a: commit subject >50 characters; use shorter first paragraph
remote: (W) 2ed19f0: commit subject >50 characters; use shorter first paragraph
remote: (W) 7125acf: commit subject >50 characters; use shorter first paragraph
remote: (W) fe85ce3: commit subject >50 characters; use shorter first paragraph
remote: (W) 2254eed: commit subject >50 characters; use shorter first paragraph
remote: (W) d15ba41: commit subject >50 characters; use shorter first paragraph
remote:
remote: Updated Changes:
remote:   https://review.gluster.org/#/c/glusterfs/+/20919 {cli-cmd-parser|cli-rpc-ops||cli-xml-output}.c: strncpy()->sprintf(), reduce ...
remote:   https://review.gluster.org/#/c/glusterfs/+/20920 {mount-common|fusermount|mount_darwin|umountd}.c: strncpy()->sprintf(), ...
remote:   https://review.gluster.org/#/c/glusterfs/+/20921 extras/geo-rep/gsync-sync-gfid.c: move from strlen() to sizeof()
remote:   https://review.gluster.org/#/c/glusterfs/+/20922 multiple files: move from strlen() to sizeof()
remote:   https://review.gluster.org/#/c/glusterfs/+/20923 multiple files: move from strlen() to sizeof()
remote:   https://review.gluster.org/#/c/glusterfs/+/20924 bit-rot xlator: strncpy()->sprintf(), reduce strlen()'s
remote:   https://review.gluster.org/#/c/glusterfs/+/20925 changelog xlator: strncpy()->sprintf(), reduce strlen()'s
remote:   https://review.gluster.org/#/c/glusterfs/+/20926 changetimerecoder xlator: strncpy()->sprintf(), reduce strlen()'s
remote:   https://review.gluster.org/#/c/glusterfs/+/20927 xlators: move from strlen() to sizeof()
remote:   https://review.gluster.org/#/c/glusterfs/+/20928 NFS server (mount3.c, nfs-inodes.c): strncpy()->sprintf(), reduce strlen()'s
remote:   https://review.gluster.org/#/c/glusterfs/+/20929 multiple xlators: move from strlen() to sizeof()
remote:   https://review.gluster.org/#/c/glusterfs/+/20930 multiple xlators: strncpy()->sprintf(), reduce strlen()'s
remote:   https://review.gluster.org/#/c/glusterfs/+/20931 multiple xlators (mgmt): strncpy()->sprintf(), reduce strlen()'s
remote:   https://review.gluster.org/#/c/glusterfs/+/20932 multiple xlators (storage/posix): strncpy()->sprintf(), reduce strlen()'s
remote:   https://review.gluster.org/#/c/glusterfs/+/20933 Various files: strncpy()->sprintf(), reduce strlen()'s
remote:
remote: Pushing to refs/publish/* is deprecated, use refs/for/* instead.
To ssh://review.gluster.org/glusterfs
 * [new branch]      HEAD -> refs/publish/master/remove_strncpy2
Alright, so 1 GB of swap isn't enough. Michael, can you give the VM 4 more GB of RAM? Please add another 10 Gig of disk space as well so we can have a larger swap partition. For each patch, there's a git process started, there's at least one email sent out, and Gerrit triggers 5 to 6 Jenkins jobs for smoke. When this is done all at once for 10+ patches, this consumes quite a bit of memory. I remember we used to give Gerrit a lot of RAM and consciously cut it down since it was using it all the time.
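To put rough numbers on the fan-out described above, here is a small back-of-the-envelope sketch. It only counts work items using the figures from that comment (one git process per patch, at least one email, 5 to 6 Jenkins smoke jobs); the function name push_fanout is made up for illustration, and it says nothing about how much memory each item actually uses.

```python
def push_fanout(patches, jenkins_jobs_per_patch=(5, 6)):
    """Count the work items triggered by pushing `patches` changes at once.

    Assumes the per-patch figures quoted in this bug: one git process,
    at least one email, and 5 to 6 Jenkins smoke jobs.
    """
    low, high = jenkins_jobs_per_patch
    return {
        "git_processes": patches,
        "emails_min": patches,  # "at least one email" per patch
        "jenkins_jobs_min": patches * low,
        "jenkins_jobs_max": patches * high,
    }


if __name__ == "__main__":
    # The second push in this bug reported "updated: 15".
    print(push_fanout(15))
```

For the 15-change push this gives 15 git processes, at least 15 emails, and between 75 and 90 Jenkins jobs triggered at about the same time.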
It would need a reboot for that.
Also, I can't increase the root partition for some reason, and I need to figure out why, because LVM does say there is enough space. I did change the configuration to allow a maximum of 8G of RAM, so a reboot is needed (not a reboot from inside the system, but one from outside, i.e. a destroy/start of the VM in virsh).
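Once the VM has been restarted, one way to confirm from inside the guest that the new RAM and the swap are visible is to read /proc/meminfo. The sketch below is only an illustration: parse_meminfo and read_meminfo are hypothetical helper names, and the sample values are invented, not taken from the Gerrit host.

```python
def parse_meminfo(text):
    """Return {field: KiB} for the 'Name:   value kB' lines of /proc/meminfo."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        # Keep only size fields; counters such as HugePages_Total have no unit.
        if sep and len(parts) == 2 and parts[1] == "kB" and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def read_meminfo(path="/proc/meminfo"):
    """Read and parse the live file (Linux only)."""
    with open(path) as handle:
        return parse_meminfo(handle.read())


# Hypothetical sample for illustration; not taken from the Gerrit host.
SAMPLE = """MemTotal:        8009720 kB
MemFree:          512000 kB
SwapTotal:       1048572 kB
HugePages_Total:       0
"""

if __name__ == "__main__":
    info = parse_meminfo(SAMPLE)
    print("RAM  GiB:", round(info["MemTotal"] / 1024 ** 2, 1))
    print("swap GiB:", round(info["SwapTotal"] / 1024 ** 2, 1))
```

On the real box, read_meminfo() would be called instead of parsing the sample, and MemTotal should come out somewhat below the configured size because the kernel reserves part of it.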
After the increase in RAM, this seems to work way better. Going to close the bug.