Bug 2055504
Summary: | Set gutenprint53+usb backend to use the default USB context | ||
---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | Matt Fagnani <matt.fagnani> |
Component: | gutenprint | Assignee: | Zdenek Dohnal <zdohnal> |
Status: | CLOSED ERRATA | QA Contact: | Fedora Extras Quality Assurance <extras-qa> |
Severity: | unspecified | Docs Contact: | |
Priority: | unspecified | ||
Version: | 35 | CC: | bberg, hdegoede, jnovy, jridky, jv+fedora, lucilanga, pizza, plroskin, rhbugs, rkrawitz, twaugh, umar, victortoso, zdohnal |
Target Milestone: | --- | ||
Target Release: | --- | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | gutenprint-5.3.4-8.fc36 gutenprint-5.3.4-8.fc34 gutenprint-5.3.4-8.fc35 | Doc Type: | If docs needed, set a value |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2022-04-05 00:15:55 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: |
Description
Matt Fagnani
2022-02-17 06:27:02 UTC
Hi Matt, thank you for reporting the issue! Yes, that 'mutex' looks suspicious - it is taken from libusb library from its default context, so the further investigation can happen there. I'll reassign the bug to libusb and it would be great if you filed a separate one for your printing issues (for hplip) and put there data from the following steps https://docs.fedoraproject.org/en-US/quick-docs/how-to-debug-printing-problems/#_i_have_hp_printer_installed_it_with_hplip_and_have_a_problem_with_it . Thank you in advance! (In reply to Zdenek Dohnal from comment #1) > Hi Matt, > > thank you for reporting the issue! > > Yes, that 'mutex' looks suspicious - it is taken from libusb library from > its default context, so the further investigation can happen there. > > I'll reassign the bug to libusb and it would be great if you filed a > separate one for your printing issues (for hplip) and put there data from > the following steps > https://docs.fedoraproject.org/en-US/quick-docs/how-to-debug-printing- > problems/ > #_i_have_hp_printer_installed_it_with_hplip_and_have_a_problem_with_it . > > Thank you in advance! OK that makes sense. I've reported the hp printer errors against hplip with the cups debug logging enabled journal attached at https://bugzilla.redhat.com/show_bug.cgi?id=2055871 Thanks. Please downgrade libusb1 to version 1.0.24 for now. 1.0.25 had a regression, and my attempt to fix it made things even worse. Upstream PR for this issue is https://github.com/libusb/libusb/pull/1073 (In reply to Benjamin Berg from comment #3) > Please downgrade libusb1 to version 1.0.24 for now. 1.0.25 had a regression, > and my attempt to fix it made things even worse. > > Upstream PR for this issue is https://github.com/libusb/libusb/pull/1073 I downgraded to libusb1-1.0.24-4.fc35 and rebooted. I started the printer. There weren't any of the errors or gutenprint53+usb crashes with libusb1-1.0.24-4.fc35. I printed a PDF normally with okular. Thanks. Having the exact same problem with HP Deskjet 3050A J611 printer on usb port. Spent an hour trying to figure it out. I was getting the same messages. Downgrading to libusb1 1.0.24-4 also resolved the problem. The updated 1073 patch solved the cups/hp problem for me but gutenprint still crashes. Hrm, the mutex thing is a NULL pointer, i.e. no default USB context. That smells like it is caused by https://github.com/libusb/libusb/commit/32a22069428cda9d63aa666e92fb8882a83d4515 i.e. that gutenprint is first calling libusb_init with an out pointer, but then relying on the default USB context being set and calling libusb_get_device_list with a NULL pointer rather than the context. *** Bug 2055871 has been marked as a duplicate of this bug. *** OK, while libusb1 is the reason for the regression, this *is* a gutenprint bug. The below patch should fix this (untested). Assigning back to gutenprint. But, I am happy to add a conflicts to any new upload I do for libusb1. diff --git a/cups/backend_common.c b/cups/backend_common.c index 6333408..95708ee 100644 --- a/cups/backend_common.c +++ b/cups/backend_common.c @@ -1008,9 +1008,8 @@ along with this program; if not, see <https://www.gnu.org/licenses/>.\n\n"; fprintf(logger, "%s", license); } -void print_help(const char *argv0, const struct dyesub_backend *backend) +void print_help(const char *argv0, const struct dyesub_backend *backend, struct libusb_context *ctx) { - struct libusb_context *ctx = NULL; struct libusb_device **list = NULL; const char *ptr = getenv("BACKEND"); @@ -1423,7 +1422,7 @@ int main (int argc, char **argv) /* If we don't have a valid backend, print help and terminate */ if (!backend && !stats_only) { - print_help(argv[0], NULL); // probes all devices + print_help(argv[0], NULL, ctx); // probes all devices ret = CUPS_BACKEND_OK; goto done; } @@ -1431,7 +1430,7 @@ int main (int argc, char **argv) /* If we're in standalone mode, print help only if no args */ if ((!uri || !strlen(uri)) && !stats_only) { if (argc < 2) { - print_help(argv[0], backend); // probes all devices + print_help(argv[0], backend, ctx); // probes all devices ret = CUPS_BACKEND_OK; goto done; } (In reply to Benjamin Berg from comment #7) > Hrm, the mutex thing is a NULL pointer, i.e. no default USB context. > According `usbi_get_context()` [1] you get the default USB context if you pass NULL pointer. Then the context is saved into `ctx` [2]. The broken mutex is acquired by derefencing 'ctx' got from `usbi_get_context()` which is passed into `usbi_mutex_lock()` [3] which causes the segfault. So IMHO the cause of crash is a corrupted mutex from default USB context. However, if I move libusb_init() under the scopes with 'print_help()' (to ensure we don't get a local context, but a default one in `libusb_get_device_list`), the crash happens either way. Seems like getting a default USB context is no longer available... is this intentional? > That smells like it is caused by > https://github.com/libusb/libusb/commit/ > 32a22069428cda9d63aa666e92fb8882a83d4515 > > i.e. that gutenprint is first calling libusb_init with an out pointer, but > then relying on the default USB context being set and calling > libusb_get_device_list with a NULL pointer rather than the context. That sounds reasonable, but it would be great if libusb rejects requests with ctx==NULL as argument if the default context is no longer accessible. The previous behavior seemed to be that both contexts - default and libusb_init() generated - worked fine. [1] https://github.com/libusb/libusb/blob/4622bfcf44db373c53502e3fe873bd611e8332f6/libusb/libusbi.h#L444 [2] https://github.com/libusb/libusb/blob/4622bfcf44db373c53502e3fe873bd611e8332f6/libusb/core.c#L828 [3] https://github.com/libusb/libusb/blob/4622bfcf44db373c53502e3fe873bd611e8332f6/libusb/core.c#L837 I can update the gutenprint code and send it upstream, but it would be great if we knew whether default USB context should work and libusb was fixed appropriately. Either way, I'm able to reproduce the crash if I install the broken libusb1 version from bodhi - libusb1-1.0.24-4.fc35.x86_64 works fine. > According `usbi_get_context()` [1] you get the default USB context if you pass NULL pointer. Then the context is saved into `ctx` [2]. The broken mutex is acquired by derefencing 'ctx' got from `usbi_get_context()` which is passed into `usbi_mutex_lock()` [3] which causes the segfault. Well, the default context is NULL. Then, the pointer (0x28) we are seeing is just the offset into the struct, i.e.: #1 0x00007f4deb024c59 in usbi_mutex_lock (mutex=0x28) at os/threads_posix.h:46 > However, if I move libusb_init() under the scopes with 'print_help()' (to ensure we don't get a local context, but a default one in `libusb_get_device_list`), the crash happens either way. Seems like getting a default USB context is no longer available... is this intentional? > > I can update the gutenprint code and send it upstream, but it would be great if we knew whether default USB context should work and libusb was fixed appropriately. So, the default USB context is only created/available after calling libusb_init(NULL), and I don't think that happens in the codebase. libusb does not create this default context implicitly! You could use the default context everywhere by calling libusb_init(NULL)/libusb_exit(NULL). The default context is reference counted, so it is safe to call libusb_init(NULL) multiple times (it should just be balanced with libusb_exit(NULL) calls. What exactly did you try to do code wise? Maybe it would be easier to just rely on the libusb default context and call libusb_init/libusb_exit with a NULL argument? (In reply to Benjamin Berg from comment #12) > Well, the default context is NULL. Then, the pointer (0x28) we are seeing is > just the offset into the struct, i.e.: > > #1 0x00007f4deb024c59 in usbi_mutex_lock (mutex=0x28) at > os/threads_posix.h:46 > Ok, my bad, I've read your comment as 'not default USB context', not 'no default USB context', which implied to me the NULL pointer is not from default context. > > However, if I move libusb_init() under the scopes with 'print_help()' (to ensure we don't get a local context, but a default one in `libusb_get_device_list`), the crash happens either way. Seems like getting a default USB context is no longer available... is this intentional? > > > > I can update the gutenprint code and send it upstream, but it would be great if we knew whether default USB context should work and libusb was fixed appropriately. > > So, the default USB context is only created/available after calling > libusb_init(NULL), and I don't think that happens in the codebase. libusb > does not create this default context implicitly! Seems like it did in 1.0.24, since the same gutenprint codebase works there. Was that a bug in 1.0.24? > > You could use the default context everywhere by calling > libusb_init(NULL)/libusb_exit(NULL). The default context is reference > counted, so it is safe to call libusb_init(NULL) multiple times (it should > just be balanced with libusb_exit(NULL) calls. > > What exactly did you try to do code wise? Maybe it would be easier to just > rely on the libusb default context and call libusb_init/libusb_exit with a > NULL argument? I have a brief idea - ctx in 'print_help()' is needed only for device discovery for printing available devices on stdout, ctx in 'main()' does discovery and communication with the devices (opening, sending, receiving, closing...). 'print_help()' is actually used in global macro and its arguments are globals there, so ctx would have to be a global there as well, so I'm thinking whether 'libusb_init(NULL)'/'libusb_exit(NULL)' can be added into 'print_help()' only. For the gutenprint patch....there are two more fixes for print_help in backend_common.h. With the updated 1073 patch for libusb1 and the corrected gutenprint patch things seem to be ok. > Seems like it did in 1.0.24, since the same gutenprint codebase works there. Was that a bug in 1.0.24? Kind of. libusb 1.0.24 would set the default context even when initializing a specific one. i.e. backend-common.c:1416 /* Libusb setup */ ret = libusb_init(&ctx); would result in ctx to also be the default libusb context. And, then, the libusb_get_device_list in print_help would just use that default ctx (it always passes NULL there). To be honest, I think just using a NULL context everywhere is sane, i.e. something like the below patch. There are a number of possible and valid fixes. Are you able to push an update out before the F36 beta freeze (starts tomorrow, Tue 2022-02-22, 14:00 UTC). In that case, I could push out a new libusb 1.0.25 with the updated upstream patch, and hopefully all this is fixed. If not, I'll probably need to downgrade libusb again. diff --git a/cups/backend_common.c b/cups/backend_common.c index 6333408..b19668d 100644 --- a/cups/backend_common.c +++ b/cups/backend_common.c @@ -753,8 +753,7 @@ static struct dyesub_backend *backends[] = { NULL, }; -static int find_and_enumerate(struct libusb_context *ctx, - struct libusb_device ***list, +static int find_and_enumerate(struct libusb_device ***list, const struct dyesub_backend *backend, const char *match_serno, const char *make, @@ -777,7 +776,7 @@ static int find_and_enumerate(struct libusb_context *ctx, STATE("+org.gutenprint.searching-for-device\n"); /* Enumerate and find suitable device */ - num = libusb_get_device_list(ctx, list); + num = libusb_get_device_list(NULL, list); /* See if we can actually match on the supplied make! */ if (backend && make) { @@ -1010,7 +1009,6 @@ along with this program; if not, see <https://www.gnu.org/licenses/>.\n\n"; void print_help(const char *argv0, const struct dyesub_backend *backend) { - struct libusb_context *ctx = NULL; struct libusb_device **list = NULL; const char *ptr = getenv("BACKEND"); @@ -1072,7 +1070,7 @@ void print_help(const char *argv0, const struct dyesub_backend *backend) } /* Scan for all printers for the specified backend */ - find_and_enumerate(ctx, &list, backend, NULL, ptr, 1, 1, NULL); + find_and_enumerate(&list, backend, NULL, ptr, 1, 1, NULL); libusb_free_device_list(list, 1); } @@ -1251,7 +1249,6 @@ done: int main (int argc, char **argv) { - struct libusb_context *ctx = NULL; struct libusb_device **list = NULL; struct dyesub_backend *backend = NULL; @@ -1414,7 +1411,7 @@ int main (int argc, char **argv) #endif /* Libusb setup */ - ret = libusb_init(&ctx); + ret = libusb_init(NULL); if (ret) { ERROR("Failed to initialize libusb (%d)\n", ret); ret = CUPS_BACKEND_RETRY_CURRENT; @@ -1438,7 +1435,7 @@ int main (int argc, char **argv) } /* Enumerate devices */ - found = find_and_enumerate(ctx, &list, backend, use_serno, backend_str, 0, NUM_CLAIM_ATTEMPTS, &conn); + found = find_and_enumerate(&list, backend, use_serno, backend_str, 0, NUM_CLAIM_ATTEMPTS, &conn); if (found == -1) { ERROR("Printer open failure (No matching printers found!)\n"); @@ -1572,7 +1569,7 @@ done: if (list) libusb_free_device_list(list, 1); - libusb_exit(ctx); + libusb_exit(NULL); return ret; } (In reply to Benjamin Berg from comment #16) > Are you able to push an update out before the F36 beta freeze (starts > tomorrow, Tue 2022-02-22, 14:00 UTC). In that case, I could push out a new > libusb 1.0.25 with the updated upstream patch, and hopefully all this is > fixed. If not, I'll probably need to downgrade libusb again. > I'm not gutenprint upstream, so I would like to discuss it with upstream before pushing it into Fedora. Robert, one of the people working in gutenprint upstream, is a Red Hatter, which can speed up the things. I'll send the patch to upstream via email as well. So I would go with 'I'm not going to make it till freeze' for now. Sent email to gimp-print-devel.net. Robert, would you mind helping us here? It would be great if you could review the patch as Gutenprint upstream - it would be good if we managed to fix it before F35 beta freeze - Tue 2022-02-22, 14:00 UTC . OK, I'll have a look into writing a patch that effectively reverts parts of upstream commit https://github.com/libusb/libusb/commit/32a22069428cda9d63aa666e92fb8882a83d4515 for F36/F37. That said, I would prefer fixing gutenprint obviously, that patch is quite simple and then we may be able to catch similar regressions elsewhere in beta. OK, https://bodhi.fedoraproject.org/updates/FEDORA-2022-b445cd59eb should get things working again. It should log a warning about the API misuse though. Zdenek, let me take a look, but this isn't my code so I'm going to let the maintainer of that area handle it. That said, the maintainer of this code is very responsive and I expect that he'll look at it promptly. I installed libusb1-1.0.25-8.fc36.x86_64 on Fedora 35 and rebooted. There are no signs of gutenprint crash in the kernel log. Two USB printers work normally. They were not working with earlier 1.0.25 builds, and gutenprint was crashing. I can confirm that on FC35 too. I rebuilt the rpm from source. Per Gutenprint maintainer for this component: On Tue, Feb 22, 2022 at 07:55:18AM +0100, Zdenek Dohnal wrote: > there were changes in libusb 1.0.25 which is in Fedora 36 - f.e. default USB > context is not initialized unless you specifically pass NULL into > `libusb_init()` - gutenprint53+usb backend actually depended on this hidden > behavior, because it passes its own context address into `libusb_init()` > instead of NULL. So the change causes the backend to crash if you run it: Yikes! > According libusb maintainer, Ben Berg, the backend can work with the default > USB context and he prepared the patch for it (the file is attached). I was > able to verify the patch fixes the crash, but I don't have a device > supported by gutenprint itself, so I couldn't do an additional sanity > testing. The patch looks good to me though. Will this work for multi-threaded stuff where there could conceivably be multiple contexts in use? That's not being exercised in the current gutenprint code but some of the stuff I have cooking will change that. > Would you mind adding the patch to the project if it looks good for you? I won't be able to physically test any of this until this coming weekend (out of town and while I have remote access everything pertinent is turned off) but when I'm back, I'll be able to at least it doesn't introduce any regressions on F34 & F35. Thanks for letting us know, Pavel and Sammy! So the temporary revert in libusb works, but I will leave the bug opened until there is a fix for gutenprint upstream. Thank you for looking into it, Robert! I've invited Solomon to join us here as well, once he is able to. Ben, according the email Solomon plans to introduce multi-threading into gutenprint backend - would the default USB context work or should we use different libusb context there? About multi-threading. libusb itself is thread-safe, so, it depends on what exactly the needs are. Having separate contexts means that you can choose to only process events on one of them at a time. This avoids any issue where one thread is suddenly processing events for another (i.e. it is undefined which thread executes the callbacks for asynchronous transfers). But, e.g. just doing an enumeration from a different thread using the same context should be completely fine. I didn't have time to properly look into this until today (in the middle of a long trip with very crappy internet access at the moment) but the proposed patch look good. I had to modify it for application to the git master code, but everything should be good now. FEDORA-2022-59c07c5ccf has been submitted as an update to Fedora 36. https://bodhi.fedoraproject.org/updates/FEDORA-2022-59c07c5ccf FEDORA-2022-b1570f0952 has been submitted as an update to Fedora 35. https://bodhi.fedoraproject.org/updates/FEDORA-2022-b1570f0952 FEDORA-2022-22af30e355 has been submitted as an update to Fedora 34. https://bodhi.fedoraproject.org/updates/FEDORA-2022-22af30e355 FEDORA-2022-59c07c5ccf has been pushed to the Fedora 36 testing repository. Soon you'll be able to install the update with the following command: `sudo dnf upgrade --enablerepo=updates-testing --advisory=FEDORA-2022-59c07c5ccf` You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2022-59c07c5ccf See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates. FEDORA-2022-b1570f0952 has been pushed to the Fedora 35 testing repository. Soon you'll be able to install the update with the following command: `sudo dnf upgrade --enablerepo=updates-testing --advisory=FEDORA-2022-b1570f0952` You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2022-b1570f0952 See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates. FEDORA-2022-22af30e355 has been pushed to the Fedora 34 testing repository. Soon you'll be able to install the update with the following command: `sudo dnf upgrade --enablerepo=updates-testing --advisory=FEDORA-2022-22af30e355` You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2022-22af30e355 See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates. FEDORA-2022-59c07c5ccf has been pushed to the Fedora 36 stable repository. If problem still persists, please make note of it in this bug report. FEDORA-2022-22af30e355 has been pushed to the Fedora 34 stable repository. If problem still persists, please make note of it in this bug report. FEDORA-2022-b1570f0952 has been pushed to the Fedora 35 stable repository. If problem still persists, please make note of it in this bug report. |