Bug 1260862 - Illegal sequences of space-space in /dev/lirc0 output [NEEDINFO]
Summary: Illegal sequences of space-space in /dev/lirc0 output
Keywords:
Status: CLOSED INSUFFICIENT_DATA
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 27
Hardware: Unspecified
OS: Unspecified
unspecified
unspecified
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2015-09-08 05:53 UTC by Alec Leamas
Modified: 2018-08-29 15:10 UTC (History)
9 users (show)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-08-29 15:10:27 UTC
Type: Bug
Embargoed:
jforbes: needinfo?


Attachments (Terms of Use)
bisect log (4.32 KB, text/plain)
2016-01-15 09:10 UTC, Alec Leamas
no flags Details
Conveninece copy of commit (1.77 KB, patch)
2016-01-15 09:11 UTC, Alec Leamas
no flags Details | Diff
Patch reverting the culprit. (1.80 KB, patch)
2016-01-24 16:17 UTC, Alec Leamas
no flags Details | Diff


Links
System ID Private Priority Status Summary Last Updated
Linux Kernel 102971 0 None None None Never

Description Alec Leamas 2015-09-08 05:53:48 UTC
Description of problem:

When running the 4.04 kernel the output from /dev/lirc0 contains sequences like

space 16777215
space 2030697
pulse 2700
space 800 

I. e., after the long space which defines a sync there is a yet another space. This breaks the device semantics, which is a sequence of space/pulse durations. Two consecutive spaces just doesn't make any sense. It also breaks applications like irrecord. 

Version-Release number of selected component (if applicable):
kernel-4.1.6-200.fc22.x86_64


How reproducible: Always.


Steps to Reproduce:
1. Install lirc: dnf install lirc
2. Attach a USB IR dongle supported by the kernel.
3. Run mode2: mode2 --driver default --device /dev/lirc0
4. Push buttons on the IR remote, watch stdout.


Actual results:

Consecutive spaces found in output, see above

Expected results:

A clean space-pulse-space-pulse... sequence

Additional info:

kernel 3.17.4 does not exhibit this behaviour, so somewhere in between.

For Fedora, there is a walk-around in lirc-0.9.3-4, currently in testing. However, the bug will bite any user on an updated kernel without an updated lirc; from a kernel perspective this is perhaps not a large part of the Fedora users, but a substantial amount of the total users (Debian is still on lirc-0.9.0). And, its IMHO still a bug.

Comment 1 Laura Abbott 2015-09-08 19:51:48 UTC
Can you narrow down the problem a bit more? Identifying which major version introduced the change will be useful since 3.17 -> 4.1.6 is a big jump. You can check for old versions in koji. Please also give hardware information for the USB IR dongle you are using.

Comment 2 Alec Leamas 2015-10-10 01:05:24 UTC
Sorry for delay, other things around. 

Anyway, the initial data is actually not complete. The bug is present in 4.04, but is already fixed in 4.1.6. Closing.

Comment 3 Alec Leamas 2016-01-12 20:27:21 UTC
Reopening, bug is back in  4.2.8-200

Shown by the following output when dumping /dev/lirc0 when feeding it with disturbances (two pulses with a long delay)

space 16777215
pulse 50
space 16777215

space 16777215
pulse 100
space 16777215

The semantics of the output from this device is a series of pulse/space durations, so a space followed by another space does not make any sense.

Will try 4.0.4-301, but needs to reboot, so updating now.

Comment 4 Alec Leamas 2016-01-12 20:39:17 UTC
And, in 4.0.4-301 it's still there. So, it's not a clean cut. 

The hardware is a Topseed receiver. lsusb: Bus 002 Device 007: ID 1784:0001 TopSeed Technology Corp. eHome Infrared Transceiver.

Will try to setup som VM to go back in time. Stay tuned.

Comment 5 Alec Leamas 2016-01-12 23:02:29 UTC
results on f21 with existing kernels

3.17.4-301: No bug
4.1.13-100: Bug

I don't find any more f21 builds, and somehow I presume we should go to the bottom with it. I'll need to bisect that series, but havn't really the time now.

Comment 6 Alec Leamas 2016-01-15 09:09:04 UTC
Culprit: 

a8f29e8 [media] media/rc: Send sync space information on the lirc device

Attaching the bisect log and a convenience copy of the commit.

Now, this is seemingly not a plain bug but done on purpose. However, I don't understand the background - the bug talks about userspace requirements but AFAIK lircd is the only userspace tool accessing this data. And there has been no communication between the lirc project and the kernel about this.

Bottom line: this commit breaks the since long established  semantics for /dev/lirc data, and also userspace applications depending on these.Probablly, the commit should just be reverted. I'll try to get in touch with the committer.

Comment 7 Alec Leamas 2016-01-15 09:10:13 UTC
Created attachment 1115060 [details]
bisect log

Comment 8 Alec Leamas 2016-01-15 09:11:13 UTC
Created attachment 1115061 [details]
Conveninece copy of commit

Comment 9 Alec Leamas 2016-01-15 11:42:26 UTC
Got an answer from Austin Lund, the committer:
-----------------------------------------------

I did wonder if this would come up.

My issue was with a gpio based ir receiver. There you don't get any sync space between well separated events. That is, you get the last pulse and the first pulse of the next event together which breaks other things.

I'm currently on leave and only have intermittent 3g on my phone, so I'll have a look and try replying in the bug report when I return.

------------------------------------------------

Austin seem to be back in the middle of next week.

My provisionary view is that a fix for a specific hardware should not break existing kernel interfaces.

Comment 10 Alec Leamas 2016-01-24 16:17:19 UTC
Created attachment 1117621 [details]
Patch reverting the culprit.

Adding patch which solves the problem. The failing gpio driver(s) (which?) will of course fail, but as AFAICT this is a problem in the driver which must be solved therein, not swept under the carpet at this level.

Need comments from Austin on this.

Comment 11 Austin Lund 2016-01-25 00:42:02 UTC
I'm still on leave at the moment, so my attention is a little bit wayward.  But I don't have a problem with reverting the change but would really like to sort out the GPIO driver.  

The system I test on is a cubox-i from SolidRun.  Having no other system to test on, I just assumed that the rewrite of the IR code introduced this as a bug.  But it seems it's somehow deeper in the GPIO code.  

I had lots of trouble understanding the GPIO driver code, so I just submitted the patch to the author of the code and it seemed to get waved though.  I'm up for suggestions as to how to fix the issue I have with the GPIO driver.

Comment 12 Alec Leamas 2016-01-25 08:23:09 UTC
@laura: So, Austin and I seem to agree that this commit should be reverted. How do we proceed? Can you push the patch upstream, or should I post the patch at the linux-media ML? 

@Austin: I understand your problems wiht the gpio driver. However, the inner workings of it is above my paygrade, sorry.

Comment 13 Laura Abbott 2016-01-25 23:22:33 UTC
Go ahead and send the patch yourself. In the kernel directory, run scripts/get_maintainer.pl on the patch file to get the list of who to send the patch to and also Cc Austin. Also remember to add your Signed-off-by. You can cc me as well if you want (not necessary) and include a link to this bugzilla to make discussion easier.

Comment 14 Alec Leamas 2016-01-26 07:52:43 UTC
Done: https://patchwork.linuxtv.org/project/linux-media/list.

Comment 15 Mauro Carvalho Chehab 2016-02-01 15:12:44 UTC
(In reply to Alec Leamas from comment #12)
> @laura: So, Austin and I seem to agree that this commit should be reverted.
> How do we proceed? Can you push the patch upstream, or should I post the
> patch at the linux-media ML? 

Reverting a patch applied on 3.18 seems very risky as other drivers
may rely on this behavior.

I guess the best thing here would be to detect if a space was
already sent, before sending an extra space at ev.reset, e. g.
something like the following (untested) patch.

Please see if this patch would solve the issue:
    https://patchwork.linuxtv.org/patch/32800/

Comment 16 Alec Leamas 2016-07-08 18:51:34 UTC
(In reply to Mauro Carvalho Chehab from comment #15)
> 
> Please see if this patch would solve the issue:
>     https://patchwork.linuxtv.org/patch/32800/

Well, I could. But, if I get this right it wont work. The condition that the previous token is a pulse is always true when the timeout fires, this is obvious from the data. The problem is not whether the previous token was a pulse, the question is if *next* token is a space or not. Which means that we'd need to implement some kind of look-ahead which seems tricky given the overall real-time constraints.

Yes, there might be drivers "depending" on this in the sense that they give the wrong output and needs this fix - Austin Lund's GPIO driver is likely to be an example. OTOH, unless we revert the original fix *all* existing drivers giving correct output breaks. So, is here three options?

- Try to make a reasonable fix implementing look-ahead.
- Revert the original fix (my patch) and restore the function for almost all drivers + fixing the broken gpio driver. The catch is just that I'm not able to make that gpio driver fix, sorry.
- Leave things as is, deferring the problem to userspace. Which has the same problem implementing look-ahead. The current fix in upcoming lirc drops the correct space value coming after the bad sync, basically substituting the correct value with a bogus one from the kernel. It is not that satisfactory, so to speak.

Of course, I see the merits in the first option. I just don't understand how to do it.f

Comment 17 Fedora End Of Life 2016-07-19 17:49:01 UTC
Fedora 22 changed to end-of-life (EOL) status on 2016-07-19. Fedora 22 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.

Comment 18 Alec Leamas 2016-08-01 16:11:24 UTC
This bug is not resolved and present in f24/f35. Reopening.

Comment 19 Jan Kurik 2018-05-31 09:01:00 UTC
This bug is currently reported against a Fedora version which is already unsuported.
I am changing the version to '27', the latest supported release.

Please check whether this bug is still an issue on the '27' release.
If you find this bug not being applicable on this release, please close it.

Comment 20 Justin M. Forbes 2018-07-23 15:15:31 UTC
*********** MASS BUG UPDATE **************

We apologize for the inconvenience.  There are a large number of bugs to go through and several of them have gone stale.  Due to this, we are doing a mass bug update across all of the Fedora 27 kernel bugs.

Fedora 27 has now been rebased to 4.17.7-100.fc27.  Please test this kernel update (or newer) and let us know if you issue has been resolved or if it is still present with the newer kernel.

If you have moved on to Fedora 28, and are still experiencing this issue, please change the version to Fedora 28.

If you experience different issues, please open a new bug report for those.

Comment 21 Justin M. Forbes 2018-08-29 15:10:27 UTC
*********** MASS BUG UPDATE **************
This bug is being closed with INSUFFICIENT_DATA as there has not been a response in 5 weeks. If you are still experiencing this issue, please reopen and attach the relevant data from the latest kernel you are running and any data that might have been requested previously.


Note You need to log in before you can comment on or make changes to this bug.