Hi, with GlusterFS you can expand existing replica volumes by adding a number of bricks equal to the replica count. But how can I do that with HekaFS?
Not a bug. Send questions about HekaFS to gluster-users or ask in #gluster on freenode (irc.freenode.net)
I did use the gluster commands, but then nothing worked anymore. There are no HekaFS commands to do that, so I used the gluster expand tools and added bricks.
Reopening as an enhancement request for expand functionality and/or better documentation about how to expand a HekaFS volume.
I think it would be reasonable to consider this a doc bug. The underlying issue here is that there's no change notification between the GlusterFS and HekaFS management pieces. As a result, changes to the underlying GlusterFS volume won't even be noticed by HekaFS until the HekaFS volume is restarted. At that point we'll regenerate the multi-tenant HekaFS config and recalculate the list of daemons to start. cr, can you please verify that restarting the HekaFS volume results in the changed brick list (or other changes) being picked up properly for you? This should at least be documented; the even better news is that the underlying problem should go away as we integrate HekaFS functionality into GlusterFS instead of their being separate.
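For reference, a command-line restart of the HekaFS volume might look roughly like this (hfs_stop_volume is assumed here as the counterpart of hfs_start_volume; if your installation doesn't have it, stop and start the volume from the web interface instead):

  hfs_stop_volume MYVOL     # assumed command; MYVOL is a placeholder volume name
  hfs_start_volume MYVOL    # on start, HekaFS regenerates the multi-tenant config
                            # and the daemon list from the current GlusterFS bricks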
Hi, if I rebalance with GlusterFS and then try to restart the volume from the web interface, I get an error 500 message when I start the volume again. Regards, Christopher.
That's even more disturbing. I think there might be some interesting interactions between a global rebalance and behavior on the per-tenant volumes (which are built on subdirectories of the GlusterFS bricks), but nothing that should prevent the daemons from starting up. Is there anything in /var/log/hekafs to shed more light on the failure?
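Something along these lines should show whether anything was logged around the failed start (the exact file names under /var/log/hekafs are an assumption and may differ):

  ls -lt /var/log/hekafs/ | head     # most recently written logs first
  tail -n 50 /var/log/hekafs/*.log   # last lines of each log around the failure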
Nothing about trying to start the volume via the HekaFS web interface.

OK regula, should this workaround work?

1. Add bricks on the server management tab.
2. Add the bricks with their mount points on the volume management tab.
3. gluster volume add-brick VOLNAME NEW-BRICK on the terminal.
4. Both "gluster volume info" and hfs_list_volumes look fine.
5. "gluster volume rebalance VOLNAME start" does not work because the volume was not started via "gluster volume start VOLNAME".

Which workaround do you prefer? I have 2 bricks configured as a replicated volume and added 2 new bricks; now I want to change the volume to a distributed-replicated volume with replica 2 without losing data. Do I lose data if I delete the volume and re-create a new one?
> Nothing about trying to start the volume via the HekaFS web interface.

Start and stop are on the Volume Management page (tab), under Existing Volumes.

> OK regula, should this workaround work?
>
> 1. Add bricks on the server management tab.

No, you don't add bricks on this page, just nodes.

> 2. Add the bricks with their mount points on the volume management tab.

Yes.

> 3. gluster volume add-brick VOLNAME NEW-BRICK on the terminal.
> 4. Both "gluster volume info" and hfs_list_volumes look fine.
> 5. "gluster volume rebalance VOLNAME start" does not work because the volume was not started via "gluster volume start VOLNAME".

Generally speaking you should not use gluster commands on HekaFS volumes.

> Do I lose data if I delete the volume and re-create a new one?

No, the files will remain, untouched, on the underlying bricks. They will still be there when you recreate the new volume.
There's no way to add bricks to an existing volume in the HekaFS interface. You can add them via GlusterFS *if you're very careful* but, as noted previously, HekaFS won't notice the change until its own volumes are restarted. Without rebalancing, the sequence would look like this:

(0) Stop the volume at all levels.
(1) gluster volume add-brick MYVOL replica 2 SERVER1:BRICK1 SERVER2:BRICK2
(2) hfs_start_volume MYVOL

If you want to rebalance before starting the modified volume, you could do the following between steps (1) and (2):

(1.1) gluster volume start MYVOL
(1.2) gluster volume rebalance MYVOL start
(1.3) *wait* for the rebalance to complete
(1.4) gluster volume stop MYVOL

The last two steps are important, because you should definitely not have the volume started both through "gluster volume start" and hfs_start_volume at the same time. In the not-too-distant future, online expansion of HekaFS volumes will be possible (because by then they'll just be GlusterFS volumes), but doing this requires a level of management integration that hasn't happened yet.

If you delete and recreate the volume, then as long as you recreate it with a command that only appends new bricks at the end, there should be no loss of data.

Original definition:
  gluster volume create XYZ replica 2 brick1 brick2
OK:
  gluster volume create XYZ replica 2 brick1 brick2 brick3 brick4
NOT OK:
  gluster volume create XYZ replica 2 brick1 brick4 brick3 brick2

The last example would result in brick1+brick4 being combined into one replica pair and brick3+brick2 being combined into another, with many files present in both pairs. That would cause DHT, which tries to place each file on only one of its subvolumes, to get very confused, and data loss could result.
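Putting the two paths together, a full expansion with a rebalance would look roughly like this; MYVOL and the SERVERn:BRICKn names are placeholders, and hfs_stop_volume is assumed as the counterpart of hfs_start_volume (stop the volume from the web interface if that command doesn't exist):

  hfs_stop_volume MYVOL                                                    # stop at all levels first
  gluster volume add-brick MYVOL replica 2 SERVER3:BRICK3 SERVER4:BRICK4   # add a whole replica set
  gluster volume start MYVOL                                               # started via GlusterFS only
  gluster volume rebalance MYVOL start
  gluster volume rebalance MYVOL status                                    # repeat until complete
  gluster volume stop MYVOL
  hfs_start_volume MYVOL                                                   # picks up the new brick list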
If I try to start the volume via hfs:

Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/bottle.py", line 499, in handle
    return handler(**args)
  File "/usr/lib/python2.7/site-packages/hekafsd.py", line 119, in start_volume
    return hfs_start_volume.run_www(vol_name)
  File "/usr/lib/python2.7/site-packages/hfs_start_volume.py", line 250, in run_www
    blob = run_common(vol_name)
  File "/usr/lib/python2.7/site-packages/hfs_start_volume.py", line 243, in run_common
    url_obj = urllib2.urlopen(url)
  File "/usr/lib64/python2.7/urllib2.py", line 126, in urlopen
    return _opener.open(url, data, timeout)
  File "/usr/lib64/python2.7/urllib2.py", line 400, in open
    response = meth(req, response)
  File "/usr/lib64/python2.7/urllib2.py", line 513, in http_response
    'http', request, response, code, msg, hdrs)
  File "/usr/lib64/python2.7/urllib2.py", line 438, in error
    return self._call_chain(*args)
  File "/usr/lib64/python2.7/urllib2.py", line 372, in _call_chain
    result = func(*args)
  File "/usr/lib64/python2.7/urllib2.py", line 521, in http_error_default
    raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
HTTPError: HTTP Error 500: INTERNAL SERVER ERROR
Are you sure hekafsd is running on the new servers? This kind of traceback usually occurs when we fail to contact one of the nodes that's hosting a brick for the volume we're starting. This can be because hekafsd isn't running there (even though glusterd might be), firewall issues, DNS issues, etc. Also, make sure the volume is stopped (from both GlusterFS's and HekaFS's point of view) before trying to start it, by running "ps" to look for "glusterfsd" processes. If you're still getting a server error in that case, it's probably because of something essentially unrelated to expanding the volume, because we're not even getting to the point where the changed volume definition would matter.
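A few quick checks along those lines, run on every node that hosts a brick of the volume (the 8080 port for hekafsd matches the URL shown later in this report; adjust if your setup differs):

  pgrep -l hekafsd              # is the HekaFS management daemon running?
  pgrep -l glusterd             # is the GlusterFS management daemon running?
  ps ax | grep glusterfsd       # any leftover brick daemons from an earlier start?
  curl -sI http://NODE:8080/    # any HTTP response at all means hekafsd is reachable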
Yes, all is fine: iptables is off, and ping works for both IP and DNS names. What about the version difference?

Node 1+2 have:
  Version : 0.7
  Release : 16.fc16
Node 3+4:
  hekafs-0.7-18.fc16.x86_64.rpm

I created a temp volume on the 2 new nodes, no problem.
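To rule out a version mismatch, it's quick to compare the installed packages on every node (Fedora package names assumed; adjust as needed):

  rpm -q hekafs glusterfs glusterfs-server   # run on each node and compare the output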
Nothing changed. I rebuilt our cluster locally and installed the newest HekaFS and GlusterFS packages on every node; same error. First I built a volume with 2 nodes and one replicated volume. Then I removed the volume, built a new one with the same name and 4 nodes, and got this error on volume start:

Error 500: Internal Server Error

Sorry, the requested URL http://192.168.1.114:8080/volumes/TMP/start caused an error: Unhandled exception

Exception: HTTPError()

Traceback:
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/bottle.py", line 499, in handle
    return handler(**args)
  File "/usr/lib/python2.7/site-packages/hekafsd.py", line 117, in start_volume
    return hfs_start_volume.run_www(vol_name)
  File "/usr/lib/python2.7/site-packages/hfs_start_volume.py", line 253, in run_www
    blob = run_common(vol_name)
  File "/usr/lib/python2.7/site-packages/hfs_start_volume.py", line 246, in run_common
    url_obj = urllib2.urlopen(url)
  File "/usr/lib64/python2.7/urllib2.py", line 126, in urlopen
    return _opener.open(url, data, timeout)
  File "/usr/lib64/python2.7/urllib2.py", line 400, in open
    response = meth(req, response)
  File "/usr/lib64/python2.7/urllib2.py", line 513, in http_response
    'http', request, response, code, msg, hdrs)
  File "/usr/lib64/python2.7/urllib2.py", line 438, in error
    return self._call_chain(*args)
  File "/usr/lib64/python2.7/urllib2.py", line 372, in _call_chain
    result = func(*args)
  File "/usr/lib64/python2.7/urllib2.py", line 521, in http_error_default
    raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
HTTPError: HTTP Error 500: INTERNAL SERVER ERROR
Did it; I had to remove the tenants and recreate them.
I'm glad you found a solution. The tenant-related 500 error seems like a separate bug, so I'll clone this one to track it.
After recreating the volume, all the web servers get timeouts, and nobody knows why.
HekaFS will be merged into core Gluster