Bug 1420118

Summary: [2.2/10.2.5-22.el7cp] few valgrind mds leaks
Product: [Red Hat Storage] Red Hat Ceph Storage Reporter: Vasu Kulkarni <vakulkar>
Component: CephFS Assignee: Patrick Donnelly <pdonnell>
Status: CLOSED WONTFIX QA Contact: Vasu Kulkarni <vakulkar>
Severity: low Docs Contact:
Priority: medium    
Version: 2.2 CC: anharris, bniver, ceph-eng-bugs, hnallurv, kchai, pdonnell, vakulkar
Target Milestone: rc   
Target Release: 2.*   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1449787 Environment:
Last Closed: 2019-01-04 19:39:24 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Vasu Kulkarni 2017-02-07 21:38:57 UTC
Description of problem:

During a valgrind test run, the following leaks were seen. John mentioned that there are known issues with upstream jewel as well. I am tracking this for downstream until it is fixed, and I will link it to the upstream tracker if there is one.


2017-02-07T03:08:06.594 INFO:teuthology.orchestra.run.clara004:Running: "sudo zgrep '<kind>' /var/log/ceph/valgrind/* /dev/null | sort | uniq"
2017-02-07T03:08:06.600 INFO:teuthology.orchestra.run.clara003:Running: "sudo zgrep '<kind>' /var/log/ceph/valgrind/* /dev/null | sort | uniq"
2017-02-07T03:08:06.689 INFO:teuthology.orchestra.run.clara004.stdout:/var/log/ceph/valgrind/client.0.log:  <kind>Leak_StillReachable</kind>
2017-02-07T03:08:06.690 INFO:teuthology.orchestra.run.clara004.stdout:/var/log/ceph/valgrind/mds.a.log:  <kind>Leak_DefinitelyLost</kind>
2017-02-07T03:08:06.690 INFO:teuthology.orchestra.run.clara004.stdout:/var/log/ceph/valgrind/mon.a.log:  <kind>Leak_StillReachable</kind>
2017-02-07T03:08:06.690 DEBUG:tasks.ceph:file /var/log/ceph/valgrind/client.0.log kind   <kind>Leak_StillReachable</kind>
2017-02-07T03:08:06.691 ERROR:tasks.ceph:saw valgrind issue   <kind>Leak_StillReachable</kind> in /var/log/ceph/valgrind/client.0.log
2017-02-07T03:08:06.691 DEBUG:tasks.ceph:file /var/log/ceph/valgrind/mds.a.log kind   <kind>Leak_DefinitelyLost</kind>
2017-02-07T03:08:06.691 DEBUG:tasks.ceph:file /var/log/ceph/valgrind/mon.a.log kind   <kind>Leak_StillReachable</kind>
2017-02-07T03:08:06.691 ERROR:tasks.ceph:saw valgrind issue   <kind>Leak_StillReachable</kind> in /var/log/ceph/valgrind/mon.a.log
2017-02-07T03:08:06.697 INFO:teuthology.orchestra.run.clara003.stdout:/var/log/ceph/valgrind/mds.a-s.log:  <kind>Leak_DefinitelyLost</kind>
2017-02-07T03:08:06.697 INFO:teuthology.orchestra.run.clara003.stdout:/var/log/ceph/valgrind/mon.b.log:  <kind>Leak_StillReachable</kind>
2017-02-07T03:08:06.698 INFO:teuthology.orchestra.run.clara003.stdout:/var/log/ceph/valgrind/mon.c.log:  <kind>Leak_StillReachable</kind>
2017-02-07T03:08:06.698 DEBUG:tasks.ceph:file /var/log/ceph/valgrind/mds.a-s.log kind   <kind>Leak_DefinitelyLost</kind>
2017-02-07T03:08:06.698 DEBUG:tasks.ceph:file /var/log/ceph/valgrind/mon.b.log kind   <kind>Leak_StillReachable</kind>
2017-02-07T03:08:06.699 ERROR:tasks.ceph:saw valgrind issue   <kind>Leak_StillReachable</kind> in /var/log/ceph/valgrind/mon.b.log
2017-02-07T03:08:06.699 DEBUG:tasks.ceph:file /var/log/ceph/valgrind/mon.c.log kind   <kind>Leak_StillReachable</kind>
2017-02-07T03:08:06.699 ERROR:tasks.ceph:saw valgrind issue   <kind>Leak_StillReachable</kind> in /var/log/ceph/valgrind/mon.c.log
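
To make excerpts like the one above easier to compare across runs, the zgrep output lines can be collapsed into a per-daemon summary. The following is a minimal sketch, not part of teuthology: the function name `summarize_kinds` and the regular expression are assumptions made for illustration, and only the `stdout` lines of the form `path.log:  <kind>...</kind>` are counted.

```python
import re
from collections import defaultdict

# Matches the zgrep output embedded in teuthology stdout lines, e.g.
#   ...stdout:/var/log/ceph/valgrind/mds.a.log:  <kind>Leak_DefinitelyLost</kind>
LINE_RE = re.compile(r"(?P<path>/\S+?\.log):\s*<kind>(?P<kind>\w+)</kind>")


def summarize_kinds(lines):
    """Group valgrind error kinds by daemon name (hypothetical helper)."""
    summary = defaultdict(set)
    for line in lines:
        match = LINE_RE.search(line)
        if match is None:
            # "Running", DEBUG and ERROR lines do not have the "path.log:" shape.
            continue
        # '/var/log/ceph/valgrind/mds.a.log' -> 'mds.a'
        filename = match.group("path").rsplit("/", 1)[-1]
        daemon = filename[: -len(".log")]
        summary[daemon].add(match.group("kind"))
    return {daemon: sorted(kinds) for daemon, kinds in summary.items()}


sample = [
    "2017-02-07T03:08:06.689 INFO:teuthology.orchestra.run.clara004.stdout:"
    "/var/log/ceph/valgrind/client.0.log:  <kind>Leak_StillReachable</kind>",
    "2017-02-07T03:08:06.690 INFO:teuthology.orchestra.run.clara004.stdout:"
    "/var/log/ceph/valgrind/mds.a.log:  <kind>Leak_DefinitelyLost</kind>",
    "2017-02-07T03:08:06.691 DEBUG:tasks.ceph:file "
    "/var/log/ceph/valgrind/mds.a.log kind   <kind>Leak_DefinitelyLost</kind>",
]

for daemon, kinds in sorted(summarize_kinds(sample).items()):
    print(daemon, kinds)
```

Run against the full teuthology.log, this gives one line per daemon, which makes it quicker to see that only the mds logs contain a "Lost" kind.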


Version-Release number of selected component (if applicable):

10.2.5-22.el7cp

How reproducible:
1/1


Logs:
http://magna002.ceph.redhat.com/vasu-2017-02-06_22:49:33-fs-jewel---basic-multi/264408/teuthology.log

Comment 3 Vasu Kulkarni 2017-02-10 17:44:32 UTC
Kefu,

Do you want to look at the mon leak issue seen during the fs and rgw runs? I know there are a few known issues, but distinguishing new leaks from old ones is becoming difficult with the failed cases. Let me know if you need a different bz to track the mon issue.

http://magna002.ceph.redhat.com/vasu-2017-02-08_16:40:39-rgw-jewel---basic-multi/264717/teuthology.log

2017-02-08T18:19:10.864 INFO:teuthology.orchestra.run.pluto010:Running: "sudo zgrep '<kind>' /var/log/ceph/valgrind/* /dev/null | sort | uniq"
2017-02-08T18:19:10.869 INFO:teuthology.orchestra.run.pluto008:Running: "sudo zgrep '<kind>' /var/log/ceph/valgrind/* /dev/null | sort | uniq"
2017-02-08T18:19:10.940 INFO:teuthology.orchestra.run.pluto008.stdout:/var/log/ceph/valgrind/mon.b.log:  <kind>Leak_StillReachable</kind>
2017-02-08T18:19:10.942 INFO:teuthology.orchestra.run.pluto010.stdout:/var/log/ceph/valgrind/mon.a.log:  <kind>Leak_StillReachable</kind>
2017-02-08T18:19:10.942 INFO:teuthology.orchestra.run.pluto010.stdout:/var/log/ceph/valgrind/mon.c.log:  <kind>Leak_StillReachable</kind>
2017-02-08T18:19:10.942 DEBUG:tasks.ceph:file /var/log/ceph/valgrind/mon.a.log kind   <kind>Leak_StillReachable</kind>
2017-02-08T18:19:10.943 ERROR:tasks.ceph:saw valgrind issue   <kind>Leak_StillReachable</kind> in /var/log/ceph/valgrind/mon.a.log
2017-02-08T18:19:10.943 DEBUG:tasks.ceph:file /var/log/ceph/valgrind/mon.c.log kind   <kind>Leak_StillReachable</kind>
2017-02-08T18:19:10.943 ERROR:tasks.ceph:saw valgrind issue   <kind>Leak_StillReachable</kind> in /var/log/ceph/valgrind/mon.c.log
2017-02-08T18:19:10.943 DEBUG:tasks.ceph:file /var/log/ceph/valgrind/mon.b.log kind   <kind>Leak_StillReachable</kind>
2017-02-08T18:19:10.944 ERROR:tasks.ceph:saw valgrind issue   <kind>Leak_StillReachable</kind> in /var/log/ceph/valgrind/mon.b.log

Comment 4 Kefu Chai 2017-02-28 12:17:39 UTC
Vasu, the mon leak reports are the same issue, so there is no need to create a different bz for them. I will take a look at it later on.

Comment 5 Kefu Chai 2017-02-28 12:22:08 UTC
But judging by the backtrace, the mds leak is a different one.
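
For anyone who wants to inspect the backtraces without reading the raw files, the leak records can be pulled out of a valgrind log with a short script. This is only a sketch under stated assumptions: it assumes the files under /var/log/ceph/valgrind/ are complete, well-formed valgrind XML output (the `<kind>` tags in the zgrep output suggest XML, but a log from a killed daemon may be truncated), and the sample document, the function name `leak_backtraces`, and the frame name `hypothetical_mds_function()` are invented for illustration and are not taken from the actual logs.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample in the shape of valgrind's XML output; the real
# mds logs from this bug are not reproduced here.
SAMPLE_XML = """<valgrindoutput>
  <error>
    <unique>0x1</unique>
    <tid>1</tid>
    <kind>Leak_DefinitelyLost</kind>
    <xwhat>
      <text>64 bytes in 1 blocks are definitely lost in loss record 1 of 2</text>
      <leakedbytes>64</leakedbytes>
      <leakedblocks>1</leakedblocks>
    </xwhat>
    <stack>
      <frame><ip>0x1000</ip><fn>operator new(unsigned long)</fn></frame>
      <frame><ip>0x2000</ip><fn>hypothetical_mds_function()</fn></frame>
    </stack>
  </error>
  <error>
    <unique>0x2</unique>
    <tid>1</tid>
    <kind>Leak_StillReachable</kind>
    <xwhat>
      <text>32 bytes in 1 blocks are still reachable in loss record 2 of 2</text>
      <leakedbytes>32</leakedbytes>
      <leakedblocks>1</leakedblocks>
    </xwhat>
    <stack>
      <frame><ip>0x3000</ip><fn>malloc</fn></frame>
    </stack>
  </error>
</valgrindoutput>
"""


def leak_backtraces(xml_text, kinds=("Leak_DefinitelyLost", "Leak_PossiblyLost")):
    """Return (kind, description, [function names]) for each matching error."""
    root = ET.fromstring(xml_text)
    results = []
    for error in root.iter("error"):
        kind = error.findtext("kind")
        if kind not in kinds:
            continue
        frames = []
        stack = error.find("stack")
        if stack is not None:
            for frame in stack.findall("frame"):
                # Fall back to the instruction pointer when no symbol is present.
                frames.append(frame.findtext("fn") or frame.findtext("ip") or "?")
        results.append((kind, error.findtext("xwhat/text", default=""), frames))
    return results


for kind, text, frames in leak_backtraces(SAMPLE_XML):
    print(kind, "-", text)
    for fn in frames:
        print("    at", fn)
```

Filtering on the "Lost" kinds by default keeps the Leak_StillReachable records from the mon and client logs out of the way when the question is specifically about the mds leak.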

Comment 9 Vasu Kulkarni 2017-05-10 16:47:49 UTC
I have separated the monitor leak into a cloned bz. I am still seeing the mds leaks:


2017-05-04T00:41:33.643 INFO:teuthology.orchestra.run.clara002:Running: "sudo zgrep '<kind>' /var/log/ceph/valgrind/* /dev/null | sort | uniq"
2017-05-04T00:41:33.648 INFO:teuthology.orchestra.run.pluto003:Running: "sudo zgrep '<kind>' /var/log/ceph/valgrind/* /dev/null | sort | uniq"
2017-05-04T00:41:33.725 INFO:teuthology.orchestra.run.pluto003.stdout:/var/log/ceph/valgrind/mds.a-s.log:  <kind>Leak_DefinitelyLost</kind>
2017-05-04T00:41:33.725 INFO:teuthology.orchestra.run.pluto003.stdout:/var/log/ceph/valgrind/mon.b.log:  <kind>Leak_StillReachable</kind>
2017-05-04T00:41:33.725 INFO:teuthology.orchestra.run.pluto003.stdout:/var/log/ceph/valgrind/mon.c.log:  <kind>Leak_StillReachable</kind>
2017-05-04T00:41:33.738 INFO:teuthology.orchestra.run.clara002.stdout:/var/log/ceph/valgrind/client.0.log:  <kind>Leak_StillReachable</kind>
2017-05-04T00:41:33.739 INFO:teuthology.orchestra.run.clara002.stdout:/var/log/ceph/valgrind/mds.a.log:  <kind>Leak_DefinitelyLost</kind>
2017-05-04T00:41:33.739 INFO:teuthology.orchestra.run.clara002.stdout:/var/log/ceph/valgrind/mon.a.log:  <kind>Leak_StillReachable</kind>
2017-05-04T00:41:33.739 DEBUG:tasks.ceph:file /var/log/ceph/valgrind/client.0.log kind   <kind>Leak_StillReachable</kind>
2017-05-04T00:41:33.740 ERROR:tasks.ceph:saw valgrind issue   <kind>Leak_StillReachable</kind> in /var/log/ceph/valgrind/client.0.log
2017-05-04T00:41:33.740 DEBUG:tasks.ceph:file /var/log/ceph/valgrind/mds.a.log kind   <kind>Leak_DefinitelyLost</kind>
2017-05-04T00:41:33.740 DEBUG:tasks.ceph:file /var/log/ceph/valgrind/mon.a.log kind   <kind>Leak_StillReachable</kind>
2017-05-04T00:41:33.740 ERROR:tasks.ceph:saw valgrind issue   <kind>Leak_StillReachable</kind> in /var/log/ceph/valgrind/mon.a.log
2017-05-04T00:41:33.740 DEBUG:tasks.ceph:file /var/log/ceph/valgrind/mds.a-s.log kind   <kind>Leak_DefinitelyLost</kind>
2017-05-04T00:41:33.740 DEBUG:tasks.ceph:file /var/log/ceph/valgrind/mon.b.log kind   <kind>Leak_StillReachable</kind>
2017-05-04T00:41:33.741 ERROR:tasks.ceph:saw valgrind issue   <kind>Leak_StillReachable</kind> in /var/log/ceph/valgrind/mon.b.log
2017-05-04T00:41:33.741 DEBUG:tasks.ceph:file /var/log/ceph/valgrind/mon.c.log kind   <kind>Leak_StillReachable</kind>
2017-05-04T00:41:33.741 ERROR:tasks.ceph:saw valgrind issue   <kind>Leak_StillReachable</kind> in /var/log/ceph/valgrind/mon.c.log

http://magna002.ceph.redhat.com/vasu-2017-05-03_20:52:32-fs-jewel---basic-multi/268296/teuthology.log

Comment 10 Patrick Donnelly 2017-06-30 01:05:21 UTC
Hello Vasu, I'm the new PTL for CephFS. Can you tell me where I can find the valgrind logs for these leaks?

Comment 11 Drew Harris 2017-07-11 13:14:50 UTC
This needs to be re-tested. The logs have been lost.

Comment 12 Vasu Kulkarni 2017-07-13 22:31:29 UTC
Hi Patrick,

Sorry I missed your comment. I will rerun this on a new build and update the logs link here. I could recreate this multiple times.

Comment 14 Patrick Donnelly 2017-08-21 16:55:33 UTC
Moving this to 3.1. We can't proceed without a log.

Comment 15 Vasu Kulkarni 2017-10-20 18:47:01 UTC
I am seeing this in the new regression runs at:

http://magna002.ceph.redhat.com/vasu-2017-10-18_22:32:22-fs-luminous---basic-multi/278194/teuthology.log


2017-10-19T23:21:37.300 INFO:teuthology.orchestra.run.clara013.stdout:/var/log/ceph/valgrind/mds.a-s.log:  <kind>Leak_DefinitelyLost</kind>
2017-10-19T23:21:37.300 INFO:teuthology.orchestra.run.clara013.stdout:/var/log/ceph/valgrind/mds.a-s.log:  <kind>Leak_PossiblyLost</kind>
2017-10-19T23:21:37.301 INFO:teuthology.orchestra.run.clara013.stdout:/var/log/ceph/valgrind/mon.b.log:  <kind>Leak_StillReachable</kind>
2017-10-19T23:21:37.301 INFO:teuthology.orchestra.run.clara013.stdout:/var/log/ceph/valgrind/mon.c.log:  <kind>Leak_StillReachable</kind>
2017-10-19T23:21:37.301 DEBUG:tasks.ceph:file /var/log/ceph/valgrind/mds.a-s.log kind   <kind>Leak_DefinitelyLost</kind>
2017-10-19T23:21:37.301 DEBUG:tasks.ceph:file /var/log/ceph/valgrind/mds.a-s.log kind   <kind>Leak_PossiblyLost</kind>
2017-10-19T23:21:37.301 DEBUG:tasks.ceph:file /var/log/ceph/valgrind/mon.b.log kind   <kind>Leak_StillReachable</kind>
2017-10-19T23:21:37.302 ERROR:tasks.ceph:saw valgrind issue   <kind>Leak_StillReachable</kind> in /var/log/ceph/valgrind/mon.b.log
2017-10-19T23:21:37.302 DEBUG:tasks.ceph:file /var/log/ceph/valgrind/mon.c.log kind   <kind>Leak_StillReachable</kind>
2017-10-19T23:21:37.302 ERROR:tasks.ceph:saw valgrind issue   <kind>Leak_StillReachable</kind> in /var/log/ceph/valgrind/mon.c.log
2017-10-19T23:21:37.304 INFO:teuthology.orchestra.run.clara012.stdout:/var/log/ceph/valgrind/client.0.log:  <kind>Leak_StillReachable</kind>
2017-10-19T23:21:37.305 INFO:teuthology.orchestra.run.clara012.stdout:/var/log/ceph/valgrind/mds.a.log:  <kind>Leak_DefinitelyLost</kind>
2017-10-19T23:21:37.305 INFO:teuthology.orchestra.run.clara012.stdout:/var/log/ceph/valgrind/mds.a.log:  <kind>Leak_PossiblyLost</kind>
2017-10-19T23:21:37.305 INFO:teuthology.orchestra.run.clara012.stdout:/var/log/ceph/valgrind/mon.a.log:  <kind>Leak_StillReachable</kind>
2017-10-19T23:21:37.305 DEBUG:tasks.ceph:file /var/log/ceph/valgrind/client.0.log kind   <kind>Leak_StillReachable</kind>
2017-10-19T23:21:37.305 ERROR:tasks.ceph:saw valgrind issue   <kind>Leak_StillReachable</kind> in /var/log/ceph/valgrind/client.0.log
2017-10-19T23:21:37.305 DEBUG:tasks.ceph:file /var/log/ceph/valgrind/mds.a.log kind   <kind>Leak_DefinitelyLost</kind>
2017-10-19T23:21:37.306 DEBUG:tasks.ceph:file /var/log/ceph/valgrind/mds.a.log kind   <kind>Leak_PossiblyLost</kind>
2017-10-19T23:21:37.306 DEBUG:tasks.ceph:file /var/log/ceph/valgrind/mon.a.log kind   <kind>Leak_StillReachable</kind>
2017-10-19T23:21:37.306 ERROR:tasks.ceph:saw valgrind issue   <kind>Leak_StillReachable</kind> in /var/log/ceph/valgrind/mon.a.log
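
In every excerpt in this report, the mds "Lost" kinds show up in a DEBUG line but have no matching ERROR "saw valgrind issue" line, while the Leak_StillReachable kinds for mon and client do. The report does not say why. The sketch below only makes that pattern easy to list for a given log; the function name `unflagged` and both regular expressions are assumptions written against the line shapes shown above.

```python
import re

# Line shapes as they appear in the tasks.ceph output quoted in this bug.
DEBUG_RE = re.compile(r"DEBUG:tasks\.ceph:file (\S+) kind\s+<kind>(\w+)</kind>")
ERROR_RE = re.compile(
    r"ERROR:tasks\.ceph:saw valgrind issue\s+<kind>(\w+)</kind> in (\S+)"
)


def unflagged(lines):
    """Return (file, kind) pairs seen at DEBUG with no matching ERROR line."""
    seen, flagged = set(), set()
    for line in lines:
        match = DEBUG_RE.search(line)
        if match:
            seen.add((match.group(1), match.group(2)))
            continue
        match = ERROR_RE.search(line)
        if match:
            # The ERROR line lists the kind first and the file second.
            flagged.add((match.group(2), match.group(1)))
    return sorted(seen - flagged)


sample = [
    "2017-10-19T23:21:37.305 DEBUG:tasks.ceph:file /var/log/ceph/valgrind/mds.a.log kind   <kind>Leak_DefinitelyLost</kind>",
    "2017-10-19T23:21:37.306 DEBUG:tasks.ceph:file /var/log/ceph/valgrind/mds.a.log kind   <kind>Leak_PossiblyLost</kind>",
    "2017-10-19T23:21:37.306 DEBUG:tasks.ceph:file /var/log/ceph/valgrind/mon.a.log kind   <kind>Leak_StillReachable</kind>",
    "2017-10-19T23:21:37.306 ERROR:tasks.ceph:saw valgrind issue   <kind>Leak_StillReachable</kind> in /var/log/ceph/valgrind/mon.a.log",
]

for path, kind in unflagged(sample):
    print(path, kind)
```

For the four sample lines, taken from the end of the luminous run above, this lists only the two mds.a.log entries.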