Created attachment 992325 [details] Patch for problems reported Description of problem: The dspam sqlite3_drv runs in auto_commit mode, resulting in 1000s of transactions on a single email - rendering it unusably slow. Further, it crashes on sql errors (include expected failures such as with _ds_verify_signature) because it uses the system free() on the error message from sqlite3, instead of sqlite3_free(). Version-Release number of selected component (if applicable): dspam-3.10.2-4.el6.x86_64 How reproducible: always Steps to Reproduce: 1. run dspam_train with sqlite3 configured to see performance bug. 2. verify_signature error can be seen eventually in normal operation, but more quickly with the pydspam-1.3 test suite, with Dspam.py modified to load the sqlite3 driver (one line change). Apply the performance bug fix first - or it will be a while... 3. Actual results: Unusably slow, and crashes on verify_signature when signature does not exist. Expected results: Reasonable performance, and no crashes. Additional info: https://pypi.python.org/pypi?%3Aaction=pkg_edit&name=pydspam Patch to fix the problems attached.
dspam-3.10.2-16.fc20 has been submitted as an update for Fedora 20. https://admin.fedoraproject.org/updates/dspam-3.10.2-16.fc20
dspam-3.10.2-16.fc21 has been submitted as an update for Fedora 21. https://admin.fedoraproject.org/updates/dspam-3.10.2-16.fc21
dspam-3.10.2-10.el7 has been submitted as an update for Fedora EPEL 7. https://admin.fedoraproject.org/updates/dspam-3.10.2-10.el7
dspam-3.10.2-6.el6 has been submitted as an update for Fedora EPEL 6. https://admin.fedoraproject.org/updates/dspam-3.10.2-6.el6
Package dspam-3.10.2-10.el7: * should fix your issue, * was pushed to the Fedora EPEL 7 testing repository, * should be available at your local mirror within two days. Update it with: # su -c 'yum update --enablerepo=epel-testing dspam-3.10.2-10.el7' as soon as you are able to. Please go to the following url: https://admin.fedoraproject.org/updates/FEDORA-EPEL-2015-0884/dspam-3.10.2-10.el7 then log in and leave karma (feedback).
Thanks for the patch. Committed: http://sourceforge.net/p/dspam/code/ci/f53478286089caaa7fb21489550273ff35c6f442/
After doing some more reading on sqlite3 transactions, I believe the BEGIN should be a BEGIN IMMEDIATE. With the plain BEGIN, locks are deferred until the actual query or update - and the COMMIT may fail with SQLITE_BUSY, and need to be retried (with potential deadlock). With BEGIN IMMEDIATE, RESERVED locks (allow others to read, but not write) are acquired on the database immediately. The BEGIN IMMEDIATE may then fail, and should return an error (or does it need to loop?), but the COMMIT should be ok once the BEGIN IMMEDIATE succeeds.
I have not seen this potential problem, and multiprocess test cases are hard to construct. Dspam has its own lock file, which may prevent other dspam processes from tripping up the transaction.
https://www.sqlite.org/lang_transaction.html
However, there is just one table immediately updated (no reads between BEGIN and UPDATE/INSERT). If that update fails to obtain the lock, we error out correctly with a rollback. There can be no deadlock with just one table locked, so it is equivalent to IMMEDIATE MODE in this instance. I'm also noticing that the update uses data fetched from a previous _get_all_spam_records(), that was NOT in the transaction. So correctness depends on the dspam lock file. So this transaction is purely an efficiency issue. With that, I'll quit pontificating until I have an actual test case that breaks it.
So I don't use the sqlite driver so don't have any skin in this game. It looks like they've committed your patch upstream however you don't seem to be sure its the best way forward. If I push these builds to stable this bug will be closed. Can you let me know if you have a revised patch for the transaction mode?
Short answer: I do not have a revised patch Long answer: My reading of the docs is that because there is only one table involved, the existing patch should work. If there is a lock failure, it will fail on the first insert or update - both of which already have a ROLLBACK. Changing the transaction mode to IMMEDIATE would merely cause the failure on the BEGIN instead. So I don't think a revised patch is needed. This is not really a correct transaction anyway, because it is updating with data collected from outside the transaction. It is purely a (necessary) performance hack.
dspam-3.10.2-16.fc21 has been pushed to the Fedora 21 stable repository. If problems still persist, please make note of it in this bug report.
dspam-3.10.2-16.fc20 has been pushed to the Fedora 20 stable repository. If problems still persist, please make note of it in this bug report.
dspam-3.10.2-10.el7 has been pushed to the Fedora EPEL 7 stable repository. If problems still persist, please make note of it in this bug report.
dspam-3.10.2-6.el6 has been pushed to the Fedora EPEL 6 stable repository. If problems still persist, please make note of it in this bug report.