Bug 1761366
| Summary: | libgfapi: glfs_init() gets stuck in an infinite loop in pthread_spin_lock() | ||
|---|---|---|---|
| Product: | [Community] GlusterFS | Reporter: | Xiubo Li <xiubli> |
| Component: | libgfapi | Assignee: | bugs <bugs> |
| Status: | CLOSED DUPLICATE | QA Contact: | bugs <bugs> |
| Severity: | unspecified | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | 7 | CC: | bugs |
| Target Milestone: | --- | ||
| Target Release: | --- | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | If docs needed, set a value | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2019-10-14 09:22:46 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
*** This bug has been marked as a duplicate of bug 1761365 ***
Description of problem:

I am testing gfapi through gluster-block/tcmu-runner and hit a problem: when creating a gluster-block device, the tcmu-runner process runs at almost 100% CPU and gets stuck.

The gluster-block command is:

    [root@localhost tcmu-runner]# gluster-block create repvol/block ha 2 prealloc full 10.70.39.238,10.70.39.231 1G

Output of top while the command hangs:

    [root@localhost tcmu-runner]# top
    top - 14:14:50 up 1:07, 2 users, load average: 2.06, 1.89, 1.17
    Tasks: 116 total, 2 running, 114 sleeping, 0 stopped, 0 zombie
    %Cpu(s): 50.0 us, 3.1 sy, 0.0 ni, 46.9 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
    MiB Mem : 1990.4 total, 853.9 free, 270.9 used, 865.6 buff/cache
    MiB Swap: 0.0 total, 0.0 free, 0.0 used. 1560.8 avail Mem

     PID USER PR  NI    VIRT   RES   SHR S %CPU %MEM    TIME+ COMMAND
    8020 root 20   0 2412664 35916 21792 R 93.8  1.8 12:39.00 tcmu-runner
       1 root 20   0  108892 15540  9472 S  0.0  0.8  0:02.44 systemd
       2 root 20   0       0     0     0 S  0.0  0.0  0:00.01 kthreadd
       3 root  0 -20       0     0     0 I  0.0  0.0  0:00.00 rcu_gp

Almost all samples are in pthread_spin_lock:

    [root@localhost tcmu-runner]# perf top -p 8020
    Samples: 7K of event 'cpu-clock:pppH', 4000 Hz, Event count (approx.): 1838750000 lost: 0/0 drop: 0/0
    Overhead  Shared Object       Symbol
      99.95%  libpthread-2.29.so  [.] pthread_spin_lock
       0.01%  [kernel]            [k] __ip_queue_xmit
       0.01%  [kernel]            [k] __softirqentry_text_start
       0.01%  [kernel]            [k] _raw_spin_unlock_irqrestore
       0.01%  [kernel]            [k] run_rebalance_domains

Stack of the process:

    [root@localhost tcmu-runner]# pstack 8020
    Thread 17 (Thread 0x7f709a7fc700 (LWP 11351)):
    ...
    Thread 1 (Thread 0x7f7128527880 (LWP 8020)):
    #0 0x00007f7128d4f2b5 in pthread_spin_lock () at /lib64/libpthread.so.0
    #1 0x00007f7126ba6eba in mem_get () at /lib64/libglusterfs.so.0
    #2 0x00007f7126ba6fdd in mem_get0 () at /lib64/libglusterfs.so.0
    #3 0x00007f7126b6e004 in get_new_dict_full () at /lib64/libglusterfs.so.0
    #4 0x00007f7126b6f9f0 in dict_new () at /lib64/libglusterfs.so.0
    #5 0x00007f7126ce9e38 in glfs_init_common () at /lib64/libgfapi.so.0
    #6 0x00007f7126cea030 in glfs_init () at /lib64/libgfapi.so.0
    #7 0x00007f7126d20332 in tcmu_glfs_unlock () at /usr/lib64/tcmu-runner/handler_glfs.so
    [root@localhost tcmu-runner]#

Thread 1 is looping infinitely inside libglusterfs.so: glfs_init() reaches mem_get(), which never returns from pthread_spin_lock().

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:

1. Install the glusterfs-7.0 packages from https://download.gluster.org/pub/gluster/glusterfs/qa-releases/7.0rc3/Fedora/fedora-30/x86_64/ and build tcmu-runner and gluster-block from source.

2. Enable and start the glusterd, tcmu-runner and gluster-blockd services.

3. Create a replicated volume:

        # gluster vol create repvol replica 2 10.70.39.238:/data/repvol 10.70.39.231:/data/repvol force
        # gluster vol set repvol group gluster-block
        # gluster vol start repvol
        # gluster volume set repvol locks.mandatory-locking forced
        # gluster volume set repvol enforce-mandatory-lock on
        # gluster volume set repvol performance.client-io-threads off

4. Create the gluster-block device:

        # gluster-block create repvol/block ha 2 prealloc full 10.70.39.238,10.70.39.231 1G

5. The command in step 4 gets stuck.

Actual results:

Expected results:

Additional info: