Bug 1462162 - [Stress] : Worker crashed with "TypeError: int() argument must be a string or a number, not 'list'"
[Stress] : Worker crashed with "TypeError: int() argument must be a string or...
Status: NEW
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: geo-replication (Show other bugs)
3.3
Unspecified Unspecified
low Severity high
: ---
: ---
Assigned To: Aravinda VK
Rahul Hinduja
: ZStream
Depends On:
Blocks:
  Show dependency treegraph
 
Reported: 2017-06-16 06:48 EDT by Ambarish
Modified: 2018-01-13 13:11 EST (History)
6 users (show)

See Also:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed:
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments (Terms of Use)

  None (edit)
Description Ambarish 2017-06-16 06:48:52 EDT
Description of problem:
-----------------------

4 Node Master
2 Node Slave.

Geo-rep syncing was I.P.

Was running Bonnie,smallfiles,iozone and kernel untar from 8 FUSE mounts.

Worker crashed in the meantime :

<snip>

[2017-06-16 09:51:44.486138] E [syncdutils(/bricks5/A1):296:log_raise_exception] <top>: FAIL:
Traceback (most recent call last):
  File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py", line 326, in twrap
    tf(*aa)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1575, in syncjob
    po = self.sync_engine(pb, self.log_err)
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 1710, in rsync
    log_err=log_err)
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 56, in sup
    sys._getframe(1).f_code.co_name)(*a, **kw)
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 1025, in rsync
    "log_rsync_performance", default_value=False))
  File "/usr/libexec/glusterfs/python/syncdaemon/configinterface.py", line 264, in get_realtime
    return self.get(opt, printValue=False, default_value=default_value)
  File "/usr/libexec/glusterfs/python/syncdaemon/configinterface.py", line 375, in get
    self.update_to(d, allow_unresolved=True)
  File "/usr/libexec/glusterfs/python/syncdaemon/configinterface.py", line 352, in update_to
    for sect in self.ord_sections():
  File "/usr/libexec/glusterfs/python/syncdaemon/configinterface.py", line 311, in ord_sections
    so2[k] = int(v)
TypeError: int() argument must be a string or a number, not 'list'
[2017-06-16 09:51:44.573340] I [syncdutils(/bricks5/A1):237:finalize] <top>: exiting.
[2017-06-16 09:51:44.791912] I [repce(/bricks5/A1):92:service_loop] RepceServer: terminating on reaching EOF.
[2017-06-16 09:51:44.792418] I [syncdutils(/bricks5/A1):237:finalize] <top>: exiting.
[2017-06-16 09:51:44.797178] I [gsyncdstatus(monitor):240:set_worker_status] GeorepStatus: Worker Status: Faulty


</snip>


Version-Release number of selected component (if applicable):
-------------------------------------------------------------

glusterfs-3.8.4-28.el7rhgs.x86_64

How reproducible:
-----------------

Reporting the first occurence.


Additional info:
----------------

*MAster* :

 
Volume Name: testvol
Type: Distributed-Replicate
Volume ID: 29f18f45-c822-4c0e-84ef-737e128e0368
Status: Started
Snapshot Count: 0
Number of Bricks: 12 x 2 = 24
Transport-type: tcp
Bricks:
Brick1: gqas005.sbu.lab.eng.bos.redhat.com:/bricks1/A1
Brick2: gqas013.sbu.lab.eng.bos.redhat.com:/bricks1/A1
Brick3: gqas006.sbu.lab.eng.bos.redhat.com:/bricks1/A1
Brick4: gqas008.sbu.lab.eng.bos.redhat.com:/bricks1/A1
Brick5: gqas005.sbu.lab.eng.bos.redhat.com:/bricks2/A1
Brick6: gqas013.sbu.lab.eng.bos.redhat.com:/bricks2/A1
Brick7: gqas006.sbu.lab.eng.bos.redhat.com:/bricks2/A1
Brick8: gqas008.sbu.lab.eng.bos.redhat.com:/bricks2/A1
Brick9: gqas005.sbu.lab.eng.bos.redhat.com:/bricks3/A1
Brick10: gqas013.sbu.lab.eng.bos.redhat.com:/bricks3/A1
Brick11: gqas006.sbu.lab.eng.bos.redhat.com:/bricks3/A1
Brick12: gqas008.sbu.lab.eng.bos.redhat.com:/bricks3/A1
Brick13: gqas005.sbu.lab.eng.bos.redhat.com:/bricks4/A1
Brick14: gqas013.sbu.lab.eng.bos.redhat.com:/bricks4/A1
Brick15: gqas006.sbu.lab.eng.bos.redhat.com:/bricks4/A1
Brick16: gqas008.sbu.lab.eng.bos.redhat.com:/bricks4/A1
Brick17: gqas005.sbu.lab.eng.bos.redhat.com:/bricks5/A1
Brick18: gqas013.sbu.lab.eng.bos.redhat.com:/bricks5/A1
Brick19: gqas006.sbu.lab.eng.bos.redhat.com:/bricks5/A1
Brick20: gqas008.sbu.lab.eng.bos.redhat.com:/bricks5/A1
Brick21: gqas005.sbu.lab.eng.bos.redhat.com:/bricks6/A1
Brick22: gqas013.sbu.lab.eng.bos.redhat.com:/bricks6/A1
Brick23: gqas006.sbu.lab.eng.bos.redhat.com:/bricks6/A1
Brick24: gqas008.sbu.lab.eng.bos.redhat.com:/bricks6/A1
Options Reconfigured:
nfs.disable: off
transport.address-family: inet
features.cache-invalidation: on
features.cache-invalidation-timeout: 600
performance.stat-prefetch: on
performance.cache-invalidation: on
performance.md-cache-timeout: 600
network.inode-lru-limit: 50000
server.event-threads: 4
client.event-threads: 4
geo-replication.indexing: on
geo-replication.ignore-pid-check: on
changelog.changelog: on
cluster.enable-shared-storage: enable
[root@gqas013 glusterfs]# 

*Slave* :

[root@gqas015 ~]# gluster v info
 
Volume Name: butcher
Type: Distributed-Disperse
Volume ID: 6de155ee-2200-44bb-a8ed-bdae5acf348f
Status: Started
Snapshot Count: 0
Number of Bricks: 4 x (4 + 2) = 24
Transport-type: tcp
Bricks:
Brick1: gqas014.sbu.lab.eng.bos.redhat.com:/bricks1/A1
Brick2: gqas015.sbu.lab.eng.bos.redhat.com:/bricks1/A1
Brick3: gqas014.sbu.lab.eng.bos.redhat.com:/bricks2/A1
Brick4: gqas015.sbu.lab.eng.bos.redhat.com:/bricks2/A1
Brick5: gqas014.sbu.lab.eng.bos.redhat.com:/bricks3/A1
Brick6: gqas015.sbu.lab.eng.bos.redhat.com:/bricks3/A1
Brick7: gqas014.sbu.lab.eng.bos.redhat.com:/bricks4/A1
Brick8: gqas015.sbu.lab.eng.bos.redhat.com:/bricks4/A1
Brick9: gqas014.sbu.lab.eng.bos.redhat.com:/bricks5/A1
Brick10: gqas015.sbu.lab.eng.bos.redhat.com:/bricks5/A1
Brick11: gqas014.sbu.lab.eng.bos.redhat.com:/bricks6/A1
Brick12: gqas015.sbu.lab.eng.bos.redhat.com:/bricks6/A1
Brick13: gqas014.sbu.lab.eng.bos.redhat.com:/bricks7/A1
Brick14: gqas015.sbu.lab.eng.bos.redhat.com:/bricks7/A1
Brick15: gqas014.sbu.lab.eng.bos.redhat.com:/bricks8/A1
Brick16: gqas015.sbu.lab.eng.bos.redhat.com:/bricks8/A1
Brick17: gqas014.sbu.lab.eng.bos.redhat.com:/bricks9/A1
Brick18: gqas015.sbu.lab.eng.bos.redhat.com:/bricks9/A1
Brick19: gqas014.sbu.lab.eng.bos.redhat.com:/bricks10/A1
Brick20: gqas015.sbu.lab.eng.bos.redhat.com:/bricks10/A1
Brick21: gqas014.sbu.lab.eng.bos.redhat.com:/bricks11/A1
Brick22: gqas015.sbu.lab.eng.bos.redhat.com:/bricks11/A1
Brick23: gqas014.sbu.lab.eng.bos.redhat.com:/bricks12/A1
Brick24: gqas015.sbu.lab.eng.bos.redhat.com:/bricks12/A1
Options Reconfigured:
network.inode-lru-limit: 50000
performance.md-cache-timeout: 600
performance.cache-invalidation: on
performance.stat-prefetch: on
features.cache-invalidation-timeout: 600
features.cache-invalidation: on
transport.address-family: inet
nfs.disable: off
[root@gqas015 ~]# 



[root@gqas013 glusterfs]# gluster volume geo-replication testvol gqas014.sbu.lab.eng.bos.redhat.com::butcher  status  detail
 
MASTER NODE                           MASTER VOL    MASTER BRICK    SLAVE USER    SLAVE                                          SLAVE NODE                            STATUS     CRAWL STATUS    LAST_SYNCED    ENTRY    DATA    META    FAILURES    CHECKPOINT TIME        CHECKPOINT COMPLETED    CHECKPOINT COMPLETION TIME   
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
gqas013.sbu.lab.eng.bos.redhat.com    testvol       /bricks1/A1     root          gqas014.sbu.lab.eng.bos.redhat.com::butcher    gqas014.sbu.lab.eng.bos.redhat.com    Active     Hybrid Crawl    N/A            0        7299    0       0           2017-06-16 05:32:22    No                      N/A                          
gqas013.sbu.lab.eng.bos.redhat.com    testvol       /bricks2/A1     root          gqas014.sbu.lab.eng.bos.redhat.com::butcher    gqas014.sbu.lab.eng.bos.redhat.com    Active     Hybrid Crawl    N/A            0        7321    0       0           2017-06-16 05:32:22    No                      N/A                          
gqas013.sbu.lab.eng.bos.redhat.com    testvol       /bricks3/A1     root          gqas014.sbu.lab.eng.bos.redhat.com::butcher    gqas014.sbu.lab.eng.bos.redhat.com    Active     Hybrid Crawl    N/A            0        7303    0       0           2017-06-16 05:32:22    No                      N/A                          
gqas013.sbu.lab.eng.bos.redhat.com    testvol       /bricks4/A1     root          gqas014.sbu.lab.eng.bos.redhat.com::butcher    gqas014.sbu.lab.eng.bos.redhat.com    Active     Hybrid Crawl    N/A            8192     7318    0       0           2017-06-16 05:32:22    No                      N/A                          
gqas013.sbu.lab.eng.bos.redhat.com    testvol       /bricks5/A1     root          gqas014.sbu.lab.eng.bos.redhat.com::butcher    gqas014.sbu.lab.eng.bos.redhat.com    Active     Hybrid Crawl    N/A            0        7308    0       0           2017-06-16 05:32:22    No                      N/A                          
gqas013.sbu.lab.eng.bos.redhat.com    testvol       /bricks6/A1     root          gqas014.sbu.lab.eng.bos.redhat.com::butcher    gqas014.sbu.lab.eng.bos.redhat.com    Active     Hybrid Crawl    N/A            0        7293    0       0           2017-06-16 05:32:22    No                      N/A                          
gqas005.sbu.lab.eng.bos.redhat.com    testvol       /bricks1/A1     root          gqas014.sbu.lab.eng.bos.redhat.com::butcher    gqas015.sbu.lab.eng.bos.redhat.com    Passive    N/A             N/A            N/A      N/A     N/A     N/A         N/A                    N/A                     N/A                          
gqas005.sbu.lab.eng.bos.redhat.com    testvol       /bricks2/A1     root          gqas014.sbu.lab.eng.bos.redhat.com::butcher    gqas015.sbu.lab.eng.bos.redhat.com    Passive    N/A             N/A            N/A      N/A     N/A     N/A         N/A                    N/A                     N/A                          
gqas005.sbu.lab.eng.bos.redhat.com    testvol       /bricks3/A1     root          gqas014.sbu.lab.eng.bos.redhat.com::butcher    gqas015.sbu.lab.eng.bos.redhat.com    Passive    N/A             N/A            N/A      N/A     N/A     N/A         N/A                    N/A                     N/A                          
gqas005.sbu.lab.eng.bos.redhat.com    testvol       /bricks4/A1     root          gqas014.sbu.lab.eng.bos.redhat.com::butcher    gqas015.sbu.lab.eng.bos.redhat.com    Passive    N/A             N/A            N/A      N/A     N/A     N/A         N/A                    N/A                     N/A                          
gqas005.sbu.lab.eng.bos.redhat.com    testvol       /bricks5/A1     root          gqas014.sbu.lab.eng.bos.redhat.com::butcher    gqas015.sbu.lab.eng.bos.redhat.com    Passive    N/A             N/A            N/A      N/A     N/A     N/A         N/A                    N/A                     N/A                          
gqas005.sbu.lab.eng.bos.redhat.com    testvol       /bricks6/A1     root          gqas014.sbu.lab.eng.bos.redhat.com::butcher    gqas015.sbu.lab.eng.bos.redhat.com    Passive    N/A             N/A            N/A      N/A     N/A     N/A         N/A                    N/A                     N/A                          
gqas008.sbu.lab.eng.bos.redhat.com    testvol       /bricks1/A1     root          gqas014.sbu.lab.eng.bos.redhat.com::butcher    gqas014.sbu.lab.eng.bos.redhat.com    Passive    N/A             N/A            N/A      N/A     N/A     N/A         N/A                    N/A                     N/A                          
gqas008.sbu.lab.eng.bos.redhat.com    testvol       /bricks2/A1     root          gqas014.sbu.lab.eng.bos.redhat.com::butcher    gqas014.sbu.lab.eng.bos.redhat.com    Passive    N/A             N/A            N/A      N/A     N/A     N/A         N/A                    N/A                     N/A                          
gqas008.sbu.lab.eng.bos.redhat.com    testvol       /bricks3/A1     root          gqas014.sbu.lab.eng.bos.redhat.com::butcher    gqas014.sbu.lab.eng.bos.redhat.com    Active     Hybrid Crawl    N/A            0        7310    0       0           2017-06-16 05:32:22    No                      N/A                          
gqas008.sbu.lab.eng.bos.redhat.com    testvol       /bricks4/A1     root          gqas014.sbu.lab.eng.bos.redhat.com::butcher    gqas014.sbu.lab.eng.bos.redhat.com    Passive    N/A             N/A            N/A      N/A     N/A     N/A         N/A                    N/A                     N/A                          
gqas008.sbu.lab.eng.bos.redhat.com    testvol       /bricks5/A1     root          gqas014.sbu.lab.eng.bos.redhat.com::butcher    gqas014.sbu.lab.eng.bos.redhat.com    Passive    N/A             N/A            N/A      N/A     N/A     N/A         N/A                    N/A                     N/A                          
gqas008.sbu.lab.eng.bos.redhat.com    testvol       /bricks6/A1     root          gqas014.sbu.lab.eng.bos.redhat.com::butcher    gqas014.sbu.lab.eng.bos.redhat.com    Passive    N/A             N/A            N/A      N/A     N/A     N/A         N/A                    N/A                     N/A                          
gqas006.sbu.lab.eng.bos.redhat.com    testvol       /bricks1/A1     root          gqas014.sbu.lab.eng.bos.redhat.com::butcher    gqas015.sbu.lab.eng.bos.redhat.com    Active     Hybrid Crawl    N/A            0        7305    0       0           2017-06-16 05:32:22    No                      N/A                          
gqas006.sbu.lab.eng.bos.redhat.com    testvol       /bricks2/A1     root          gqas014.sbu.lab.eng.bos.redhat.com::butcher    gqas015.sbu.lab.eng.bos.redhat.com    Active     Hybrid Crawl    N/A            8192     7311    0       0           2017-06-16 05:32:22    No                      N/A                          
gqas006.sbu.lab.eng.bos.redhat.com    testvol       /bricks3/A1     root          gqas014.sbu.lab.eng.bos.redhat.com::butcher    gqas015.sbu.lab.eng.bos.redhat.com    Passive    N/A             N/A            N/A      N/A     N/A     N/A         N/A                    N/A                     N/A                          
gqas006.sbu.lab.eng.bos.redhat.com    testvol       /bricks4/A1     root          gqas014.sbu.lab.eng.bos.redhat.com::butcher    gqas015.sbu.lab.eng.bos.redhat.com    Active     Hybrid Crawl    N/A            0        7311    0       0           2017-06-16 05:32:22    No                      N/A                          
gqas006.sbu.lab.eng.bos.redhat.com    testvol       /bricks5/A1     root          gqas014.sbu.lab.eng.bos.redhat.com::butcher    gqas015.sbu.lab.eng.bos.redhat.com    Active     Hybrid Crawl    N/A            0        7309    0       0           2017-06-16 05:32:22    No                      N/A                          
gqas006.sbu.lab.eng.bos.redhat.com    testvol       /bricks6/A1     root          gqas014.sbu.lab.eng.bos.redhat.com::butcher    gqas015.sbu.lab.eng.bos.redhat.com    Active     Hybrid Crawl    N/A            0        7311    0       0           2017-06-16 05:32:22    No                      N/A                          
[root@gqas013 glusterfs]#
Comment 2 Aravinda VK 2017-06-16 07:44:15 EDT
Please share the sos reports.
Comment 6 Rahul Hinduja 2017-07-27 14:35:58 EDT
Have seen this issue once more on build: glusterfs-geo-replication-3.8.4-35.el7rhgs.x86_64

It was found during the automation sanity check. Worker crashes and restarts. Data sync has no side-effect. 

[2017-07-27 17:35:49.559218] E [syncdutils(/bricks/brick1/master_brick9):296:log_raise_exception] <top>: FAIL: 
Traceback (most recent call last):
  File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py", line 326, in twrap
    tf(*aa)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1582, in syncjob
    po = self.sync_engine(pb, self.log_err)
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 1710, in rsync
    log_err=log_err)
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 56, in sup
    sys._getframe(1).f_code.co_name)(*a, **kw)
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 1025, in rsync
    "log_rsync_performance", default_value=False))
  File "/usr/libexec/glusterfs/python/syncdaemon/configinterface.py", line 264, in get_realtime
    return self.get(opt, printValue=False, default_value=default_value)
  File "/usr/libexec/glusterfs/python/syncdaemon/configinterface.py", line 375, in get
    self.update_to(d, allow_unresolved=True)
  File "/usr/libexec/glusterfs/python/syncdaemon/configinterface.py", line 352, in update_to
    for sect in self.ord_sections():
  File "/usr/libexec/glusterfs/python/syncdaemon/configinterface.py", line 311, in ord_sections
    so2[k] = int(v)
TypeError: int() argument must be a string or a number, not 'list'
[2017-07-27 17:35:49.600589] I [syncdutils(/bricks/brick1/master_brick9):237:finalize] <top>: exiting.
Comment 7 Rochelle 2017-08-11 01:29:23 EDT
Have hit this issue on build : glusterfs-geo-replication-3.8.4-39.el7rhgs.x86_64

Steps to Reproduce:
1. Create Master Volume and generate data
2. Create Slave Volume and establish Geo-rep session and start(Starts with Hybrid Crawl for initial sync)
3. Once started wait for 5-10 seconds, do an rm -rf on the master mount.


[2017-08-11 04:24:53.86406] E [syncdutils(/rhs/brick1/b2):296:log_raise_exception] <top>: FAIL: 
Traceback (most recent call last):
  File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py", line 326, in twrap
    tf(*aa)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1583, in syncjob
    po = self.sync_engine(pb, self.log_err)
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 1718, in rsync
    log_err=log_err)
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 56, in sup
    sys._getframe(1).f_code.co_name)(*a, **kw)
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 1033, in rsync
    "log_rsync_performance", default_value=False))
  File "/usr/libexec/glusterfs/python/syncdaemon/configinterface.py", line 264, in get_realtime
    return self.get(opt, printValue=False, default_value=default_value)
  File "/usr/libexec/glusterfs/python/syncdaemon/configinterface.py", line 375, in get
    self.update_to(d, allow_unresolved=True)
  File "/usr/libexec/glusterfs/python/syncdaemon/configinterface.py", line 352, in update_to
    for sect in self.ord_sections():
  File "/usr/libexec/glusterfs/python/syncdaemon/configinterface.py", line 311, in ord_sections
    so2[k] = int(v)
TypeError: int() argument must be a string or a number, not 'list'
[2017-08-11 04:24:53.125792] I [syncdutils(/rhs/brick1/b2):237:finalize] <top>: exiting.
Comment 8 Kotresh HR 2017-10-17 08:57:17 EDT
Taking this out of 3.4 as it's less priority and needs more development time
to fix it.

Note You need to log in before you can comment on or make changes to this bug.