Bug 595010 - job_server: calling GetJobSummaries on a submission with live jobs causes seg fault
Status: CLOSED CURRENTRELEASE
Product: Red Hat Enterprise MRG
Classification: Red Hat
Component: condor
Version: Development
Hardware: All Linux
Priority: high, Severity: high
Target Milestone: 1.3
Assigned To: Pete MacKinnon
QA Contact: Tomas Rusnak
Reported: 2010-05-22 15:57 EDT by Pete MacKinnon
Modified: 2010-07-22 13:08 EDT

Doc Type: Bug Fix
Last Closed: 2010-07-22 13:08:56 EDT

Description Pete MacKinnon 2010-05-22 15:57:38 EDT
Stack dump for process 18818 at timestamp 1274558043 (19 frames)
condor_job_server(dprintf_dump_stack+0xc7)[0x80fc5db]
condor_job_server[0x80fc7b2]
[0x675400]
condor_job_server(_ZN8AttrList8NextExprEv+0x54)[0x8159546]
condor_job_server(_Z15jobToVariantMapPK3JobRSt3mapISsN4qpid5types7VariantESt4lessISsESaISt4pairIKSsS5_EEEPPKc+0xde)[0x80d7403]
condor_job_server(_ZN16SubmissionObject15GetJobSummariesERSt3mapISsN4qpid5types7VariantESt4lessISsESaISt4pairIKSsS3_EEERSs+0x289)[0x80ce1dd]
condor_job_server(_ZN16SubmissionObject16ManagementMethodEjRN4qpid10management4ArgsERSs+0x38)[0x80ce848]
condor_job_server(_ZN3qmf3com6redhat4grid10Submission8doMethodERSsRKSt3mapISsN4qpid5types7VariantESt4lessISsESaISt4pairIKSsS8_EEERSF_+0x5f0)[0x80c69ce]
/usr/lib/libqmf.so.1(_ZN4qpid10management19ManagementAgentImpl19invokeMethodRequestERKSsS3_S3_+0x1057)[0x13fba7]
/usr/lib/libqmf.so.1(_ZN4qpid10management19ManagementAgentImpl13pollCallbacksEj+0xc6)[0x147276]
condor_job_server(_Z16HandleMgmtSocketP7ServiceP6Stream+0x1f)[0x80c8427]
condor_job_server(_ZN10DaemonCore24CallSocketHandler_workerEibP6Stream+0x187)[0x80e26c7]
condor_job_server(_ZN10DaemonCore35CallSocketHandler_worker_demarshallEPv+0x3b)[0x80e252f]
condor_job_server(_ZN13CondorThreads8pool_addEPFvPvES0_PiPKc+0x29)[0x813dd2b]
condor_job_server(_ZN10DaemonCore17CallSocketHandlerERib+0x1bc)[0x80e24f2]
condor_job_server(_ZN10DaemonCore6DriverEv+0x180e)[0x80e2200]
condor_job_server(main+0x1ce0)[0x80f6a87]
/lib/libc.so.6(__libc_start_main+0xe6)[0x7c4a86]
condor_job_server[0x80be061]
Segmentation fault

Something appears to be wrong with the parent chaining of ClassAds.
Comment 1 Pete MacKinnon 2010-05-24 17:35:35 EDT
Trying to externalize the cluster-to-job ClassAds was causing all sorts of problems. Went to a model where the LiveJob ctor chains the parent's ClassAd.
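For readers unfamiliar with parent chaining, here is a minimal sketch of the idea behind the fix, under stated assumptions. The `Ad` class is a hypothetical stand-in for a ClassAd, and this `LiveJob` is a toy, not the class in condor_job_server. Holding the parent through `std::shared_ptr` is a choice made for this illustration only; the bug does not say how the real code manages the parent's lifetime. The point is only that the job's constructor sets up the chain itself, so anything that later walks the job's attributes (as jobToVariantMap does via AttrList::NextExpr in the stack dump above) sees a fully chained ad.

```cpp
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for a ClassAd: a flat attribute map plus an optional
// chained parent that supplies attributes the child does not define itself.
class Ad {
public:
    void set(const std::string& name, const std::string& value) {
        attrs_[name] = value;
    }

    // Chain this ad to a parent. Shared ownership (an assumption of this
    // sketch) keeps the parent alive for as long as any child refers to it.
    void chainTo(std::shared_ptr<const Ad> parent) {
        parent_ = std::move(parent);
    }

    // Look up locally first, then fall back to the chained parent.
    const std::string* lookup(const std::string& name) const {
        auto it = attrs_.find(name);
        if (it != attrs_.end()) return &it->second;
        return parent_ ? parent_->lookup(name) : nullptr;
    }

    // Flatten parent and child attributes into one map; child values win.
    std::map<std::string, std::string> flatten() const {
        std::map<std::string, std::string> out;
        if (parent_) out = parent_->flatten();
        for (const auto& kv : attrs_) out[kv.first] = kv.second;
        return out;
    }

private:
    std::map<std::string, std::string> attrs_;
    std::shared_ptr<const Ad> parent_;
};

// Hypothetical LiveJob: the constructor performs the chaining, so no caller
// has to wire the cluster (parent) ad to the job ad from the outside.
class LiveJob {
public:
    LiveJob(std::shared_ptr<const Ad> clusterAd, int procId) {
        ad_.set("ProcId", std::to_string(procId));
        ad_.chainTo(std::move(clusterAd));
    }

    const Ad& ad() const { return ad_; }

private:
    Ad ad_;
};
```

In this sketch, a job constructed from a cluster ad carrying `Cmd` and `Args` flattens to a map containing `Cmd`, `Args`, and its own `ProcId`, which is roughly the shape of result GetJobSummaries is expected to return.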
Comment 2 Pete MacKinnon 2010-07-20 11:09:01 EDT
1) Ensure condor is set up for QMF plugins.
2) Set QMF_PUBLISH_SUBMISSIONS=True.
3) Submit a job using condor_submit.
4) Start qpid-tool.
5) Run "list com.redhat.grid:submission".
6) Choose the corresponding submission from the list in step 5 (the submission name should have the cluster id at the end of the string).
7) Run "call some_qmf_object_number_from_step_6 GetJobSummaries".

The call should return a map of job details such as cmd and args.
