Description of problem:
On Sunday I updated to 9.1.3-2 with brotli support. Two days later, after adding a new site, the cache became internally inconsistent:

[root@s113 trafficserver]# ls -ls /var/cache/trafficserver/cache.db
2095520 -rw-r--r-- 1 trafficserver trafficserver 10737418240 13. Sep 11:27 /var/cache/trafficserver/cache.db

Crashes:

[root@s113 trafficserver]# traffic_cache_tool list
0 [22]: A span file specified by --spans is required
[root@s113 trafficserver]# traffic_cache_tool list --spans /var/cache/trafficserver/cache.db
terminate called after throwing an instance of 'std::length_error'
  what(): basic_string::_M_replace_aux
Aborted (core dumped)
[root@s113 trafficserver]# traffic_cache_tool dir_check --spans /var/cache/trafficserver/cache.db full
terminate called after throwing an instance of 'std::length_error'
  what(): basic_string::_M_replace_aux
Aborted (core dumped)

Parts of the cache are intact, as they are still being served; other parts are not.

Version-Release number of selected component (if applicable):
Name : trafficserver
Version : 9.1.3
Release : 2.fc35
Architecture: x86_64
Install Date: Sun 11 Sep 2022 13:33:33 CEST
Update: I removed cache.db the hard way, as none of the traffic_cache_tool options for clearing the cache did anything useful. With the newly created cache, Trafficserver started normally and is functional, but the tool still does not work at all:

[root@s113 trafficserver]# rm -f /var/cache/trafficserver/cache.db
[root@s113 trafficserver]# systemctl start trafficserver.service
[root@s113 trafficserver]# traffic_cache_tool --spans /var/cache/trafficserver/cache.db scan
terminate called after throwing an instance of 'std::length_error'
  what(): basic_string::_M_replace_aux
Aborted (core dumped)
Are you able to reproduce this problem with upstream? I suspect this is an upstream rather than packaging bug -- I can open one on GitHub and copy you if you'd like.
This bug is caused by the mistaken assumption that it is OK to supply the cache file itself rather than the cache directory it is placed in. Good code would perform a sanity check, e.g. check the file size before trying to allocate 10G of RAM. I would count the crash as a bug, but it's up to you, Jered, whether you want to open a bug upstream.

For anyone having the "same issue", here is how to do it correctly:

# traffic_cache_tool list --spans /etc/trafficserver/storage.config
0 [1]: Directory support not yet available

which in my case is synonymous with:

# traffic_cache_tool list --spans /var/cache/trafficserver/
0 [1]: Directory support not yet available

"Correct" does not necessarily mean it will do anything ;) They have not been able to implement this in 10 major versions of ATS. Such a shame :(
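To illustrate the kind of sanity check meant above, here is a minimal sketch in Python (not the C++ of traffic_cache_tool, and not taken from the ATS sources). The function name check_span_config and the 1 MiB limit MAX_CONFIG_BYTES are made up for the example. It only covers the case where the --spans argument is about to be read as a text config file; how the tool handles raw devices is not considered here.

```python
import os
import stat

# Hypothetical limit: a span config such as storage.config is a small text
# file, so anything far larger is almost certainly not a config.
MAX_CONFIG_BYTES = 1024 * 1024


def check_span_config(path, max_bytes=MAX_CONFIG_BYTES):
    """Return None if path looks like a small text config, else an error string.

    Illustrative only: this is not how traffic_cache_tool validates input.
    """
    try:
        info = os.stat(path)
    except OSError as exc:
        return f"cannot stat {path}: {exc.strerror}"
    if stat.S_ISDIR(info.st_mode):
        return f"{path} is a directory, expected a span config file"
    if not stat.S_ISREG(info.st_mode):
        # Devices and other special files are outside the scope of this sketch.
        return f"{path} is not a regular file"
    if info.st_size > max_bytes:
        # Reject before reading anything into memory.
        return f"{path} is {info.st_size} bytes, too large for a config file"
    with open(path, "rb") as handle:
        # A NUL byte in the first 4 KiB suggests binary data such as a cache.
        if b"\0" in handle.read(4096):
            return f"{path} looks like binary data, not a text config"
    return None
```

With a check like this, the 10737418240-byte cache.db from this report would be rejected by the size test before any attempt to parse it, and the user would get a readable message instead of an uncaught std::length_error.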
I believe what's happening here is that loadSpanConfig() is trying to parse your cache.db as a config file, and hitting an assertion in doing so. (I don't see this behavior but get a different nonsensical error.) As you point out, the underlying issue is that traffic_cache_tool is not supported on directory caches. I'll add a note to this existing open issue, but I see this as an RFE rather than a bug unfortunately: https://github.com/apache/trafficserver/issues/5168