Requesting some updates to sos. Currently we have a plugin (rpm.rpmdb) which is off by default and only copies all of /var/lib/rpm into the sosreport. My suggestions for improvement are:

1. When the rpm/yum plugin fails or times out, automatically grab the rpm database.
1b. Decrease the timeout for failing rpm/yum plugins.
2. The rpmdb plugin should first attempt to tar the entire database, which is a more usable format for engineers.

RHEL 7 and lower
# tar cjf rpm_db-$(hostname).tar.bz2 /var/lib/{rpm,yum}

RHEL 8 and higher
# tar cJhf rpm_db-$(hostname).tar.xz /var/lib/{rpm,dnf} /etc/{dnf*,os-release}

- - - - - - - - -

If there is any concern over size, I would secondarily request that we generate the tar outside of the sosreport. Example:

sosreport > detects rpm issues > generates /var/tmp/rpmdb.tar.xz

Or just have it inside the rpm database.
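For illustration only, the version-dependent tar invocation above could be chosen roughly as in the sketch below. The function name `rpmdb_tar_command` and its parameters are hypothetical and not part of sos; the sketch only builds the argument list and does not run tar.

```python
import glob
from typing import List


def rpmdb_tar_command(rhel_major: int, hostname: str) -> List[str]:
    """Build the tar argv proposed above for a given RHEL major version.

    Hypothetical helper, not an sos API.
    """
    if rhel_major <= 7:
        # bzip2-compressed archive of the rpm and yum state directories
        return ["tar", "cjf", f"rpm_db-{hostname}.tar.bz2",
                "/var/lib/rpm", "/var/lib/yum"]
    # xz-compressed archive that dereferences symlinks (-h) and also
    # picks up the dnf configuration and /etc/os-release
    dnf_paths = sorted(glob.glob("/etc/dnf*"))
    return ["tar", "cJhf", f"rpm_db-{hostname}.tar.xz",
            "/var/lib/rpm", "/var/lib/dnf", *dnf_paths, "/etc/os-release"]
```

Unlike the shell form, an unmatched /etc/dnf* pattern is simply dropped here rather than passed to tar literally.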
Indeed, /var/lib/rpm is quite big. Output of "du -hs /var/lib/rpm" from various Fedora and RHEL systems:

216M /var/lib/rpm
182M /var/lib/rpm
273M /var/lib/rpm
277M /var/lib/rpm
223M /var/lib/rpm
41M /var/lib/rpm

So I would also be against collecting that by default (I think this is the consensus here) and mildly against collecting it after a plugin times out.
Given the big size of /var/lib/rpm, I also tend to agree with Jake and would not collect the directory content after a plugin timeout. So we can currently offer:

- decreasing the rpm plugin timeout to e.g. 1 minute. The commands are usually very quick and can exceed this timeout only in case of a lock, which would block command execution for the whole default 5m plugin timeout anyway, so the decrease speeds things up when there are locking problems
- adding a plugin option, disabled by default, to grab the whole directory content without a size limit

Or am I missing some idea or option?
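To make the timeout point concrete, here is a minimal standalone sketch of a time-bounded command run. It does not use the sos plugin API; `run_bounded` and the 60 second default are assumptions for illustration.

```python
import subprocess


def run_bounded(cmd, timeout_s=60):
    """Run cmd and return its stdout, or None if it exceeds timeout_s.

    Hypothetical helper: a command blocked on a lock never finishes on
    its own, so a shorter timeout only changes how long we wait.
    """
    try:
        done = subprocess.run(cmd, capture_output=True, text=True,
                              timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising
        return None
    return done.stdout
```

A call such as run_bounded(["rpm", "-qa"]) would return quickly on a healthy system and give up after one minute on a locked database.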
(In reply to Pavel Moravec from comment #13)
> Also with the big size of /var/lib/rpm, I tend to agree with Jake and not
> collect the directory content after plugin timeout. So we can currently
> offer:
>
> - decreasing rpm plugin timeout to e.g. 1 minute (as the commands are
> usually very quick and can exceed this timeout only in case of a lock -
> which prevent commands execution also on the default 5m plugin timeout - so
> the timeout decrease can speed up things in case of locking problems
> - add a plugin option - disabled by default - to grab whole directory
> content without a size limit
>
> Or do I miss some idea or option?

Hello, please let us know your preferences (or suggest another idea).
The reality today is that if a sosreport doesn't have an operable RPM database, the Support Delivery teams ask for the /var/lib/rpm directory. The raised concern around the size of the directory is valid, and indicates to me that we should not pursue just a blanket gathering of the contents. That being said, the request here is to get the teams the data they need on the initial capture when that invalid state is encountered. I don't see how having a timeout-bound fallback behaviour of gathering the /var/lib/rpm directory is anything less than perfect for this purpose.
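As a rough sketch of what such a fallback could look like (not how sos implements anything today), the logic is: try the normal query first, and list the database directory for collection only when the query yields nothing. `collect_rpm_data` and `run_query` are hypothetical names; `run_query` is assumed to return None on failure or timeout.

```python
import os


def collect_rpm_data(run_query, rpmdb_dir="/var/lib/rpm"):
    """Return ("query", output) normally, or ("rpmdb", files) as a fallback.

    Hypothetical helper: run_query is any callable that returns the
    package listing, or None when the rpm query failed or timed out.
    """
    output = run_query()
    if output is not None:
        return ("query", output)
    # Fallback path: enumerate the raw database files to be copied
    files = []
    for root, _dirs, names in os.walk(rpmdb_dir):
        files.extend(os.path.join(root, name) for name in sorted(names))
    return ("rpmdb", files)
```

With this shape the large directory is only touched in the invalid state, which addresses the size concern for healthy systems.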