Handbrake x264 encodes are drastically slower when using F17 with kernel 3.5.x than they were with kernel 3.4.x. I use handbrake to encode HD video with some pretty high x264 settings (x264 options displayed in the 'advanced' tab in handbrake are: 'mixed-refs=0:weightb=0:me=umh:ref=6:subme=8:merange=24:b-adapt=2:bframes=5'). I do a 2-pass encode with an average bitrate of 10,000kbps. My machine is a quad-core core i5 and running Fedora 17. Typically, when encoding a video with these settings I get somewhere from 7-15 fps, typically around 9-10 average fps. I've noticed significant slow downs with encoding ever since Fedora updated to the 3.5.x kernel -- it may just be coincidence, but I never had such slow downs before. An encode with the same settings now goes at about 1 fps. I was running Handbrake 0.9.6, and I've since updated to 0.9.8 but I still see the slow performance when running kernel 3.5.x. When I boot into 3.4.6-2.fc17.x86_64 the performance is back to normal. The ghb process takes up ~390%+ of processor (4-core) in both cases (this is expected). I've had this problem with two different 3.5 kernels: kernel-3.5.0-2.fc17.x86_64 kernel-3.5.1-1.fc17.x86_64 I understand that Handbrake isn't part of Fedora, but it seems to be a very reproducible method for demonstrating a performance regression.
I'm not on Fedora but Arch Linux x64, but since this is the only appropriate bug report I could find, here it goes: I can confirm this with compressing videos directly with ffmpeg: ffmpeg -i input.mkv -threads 0 -c:v libx264 -preset medium -crf 20 -c:a copy -c:s copy -map 0 output.mkv with a DVD-Rip as input source this gives me around 94fps on kernel 3.4.9 but only about 12fps on kernel 3.5.3. Everything else seems normal, ffmpeg uses around 390% CPU, all cores are running at full frequency (3.3 GHz). I'm using an Intel Core i5 2500. This is a pretty serious regression, so I'm quite surprised that I can't find any other reports about this... !?
I'm still seeing this on the latest f17 kernel, kernel-3.5.3-1.fc17.x86_64 An encode actually started out at "normal" speeds (around 10 fps), and then dropped to 1 fps and stayed at that rate for quite a while before I stopped it.
(In reply to comment #1) > I'm not on Fedora but Arch Linux x64, but since this is the only appropriate > bug report I could find, here it goes: <snip> > I'm using an Intel Core i5 2500. > > This is a pretty serious regression, so I'm quite surprised that I can't > find any other reports about this... !? I see this issue on Arch x86_64 as well. h.264 encodes with HandBrake slow to 1-2 fps on the 2nd pass when using 3.5 series kernel. When I roll back to a 3.4 series kernel, I get speeds in the 20-25 fps range. This issue only seems to affect my i5-3450S (Ivy Bridge). I get the same encode speeds regardless of the kernel used on my i3-2105 (Sandy Bridge), which is about 10-15 fps on the 2nd pass.
I didn't mention this in my initial report, but I'm running an Intel Core i5 2500 (same as comment #1)
I wonder if it could be the same problem (or similar) as this http://lists.libav.org/pipermail/libav-devel/2012-July/031698.html This got fixed in this commit http://git.libav.org/?p=libav.git;a=commit;h=30b45d9 We first started seeing this problem when some code was converted from inline asm to yasm. But maybe it existed before this as well and was just not being provoked due to different timing.
I tried libav, the latest stable 0.8.3 and compiled from git - both show the same problem... :(
libav stable 0.8.3 is from before the inline asm code change that provoked the problem that we saw. And git head has the fix the restored the performance for us. So it's not the exact same problem, but I still wonder if it is similar in some way. Perhaps something in the kernel is doing the same thing that this error in libav did.
I have a theory, but not really enough information to back it up. So maybe someone can shoot this down to rule it out. I have upgraded 2 physical machines and 1 VM to latest ubuntu 12.10 or Fedora 17 and I have not seen this problem on any of them. So there must be some other common factor other than the kernel version. I'm wondering if this is a driver issue. Specifically I'm thinking intel graphics driver. So is there anyone that has this problem and is running with non-intel graphics? And is there anyone that could pop an alternate graphics card in a machine that exhibits this problem (and change drivers) and confirm or rule out this theory?
The machine I'm experiencing this on is running with an nvidia graphics card running F17. Which kernel(s) were you using that didn't exhibit the behavior?
Fedora 17 Linux version 3.6.3-1.fc17.x86_64 Ubuntu 12.10 Linux version 3.5.0-17-generic
I'm still able to reproduce on 3.6.6-1.fc17.x86_64. Interestingly (to me), the encode sometimes takes a while before it slows down -- my test was going for about 30 minutes before it dropped down to encode at ~1-2 fps (from about 11). @John, what is your test case that you weren't able to reproduce on the newer 3.6 kernel?
(In reply to comment #11) > I'm still able to reproduce on 3.6.6-1.fc17.x86_64. Interestingly (to me), > the encode sometimes takes a while before it slows down -- my test was going > for about 30 minutes before it dropped down to encode at ~1-2 fps (from > about 11). > > @John, what is your test case that you weren't able to reproduce on the > newer 3.6 kernel? Kernel is 3.6.6-1.fc17.x86_64 lsmod Module Size Used by udf 89486 0 crc_itu_t 12579 1 udf des_generic 21343 0 md4 12595 0 nls_utf8 12557 1 cifs 310441 0 dns_resolver 13096 1 cifs fscache 55155 1 cifs fuse 77993 3 lockd 93476 0 sunrpc 255586 1 lockd rfcomm 68884 4 bnep 19584 2 vboxpci 23157 0 vboxnetadp 25670 0 vboxnetflt 23442 0 vboxdrv 296272 3 vboxnetadp,vboxnetflt,vboxpci snd_hda_codec_hdmi 36232 4 iTCO_wdt 13480 0 iTCO_vendor_support 13419 1 iTCO_wdt uvcvideo 80922 0 videobuf2_vmalloc 12967 1 uvcvideo videobuf2_memops 13354 1 videobuf2_vmalloc videobuf2_core 34237 1 uvcvideo videodev 119976 2 uvcvideo,videobuf2_core media 20444 2 uvcvideo,videodev coretemp 13393 0 kvm_intel 132762 0 btusb 23870 0 bluetooth 317602 24 bnep,btusb,rfcomm kvm 419458 1 kvm_intel arc4 12615 2 microcode 23446 0 snd_hda_codec_realtek 78081 1 snd_hda_intel 37618 2 snd_hda_codec 126889 3 snd_hda_codec_realtek,snd_hda_codec_hdmi,snd_hda_intel ath9k 145017 0 serio_raw 13375 0 snd_hwdep 17650 1 snd_hda_codec ath9k_common 13602 1 ath9k snd_seq 64844 0 ath9k_hw 381816 2 ath9k_common,ath9k snd_seq_device 14136 1 snd_seq ath 23103 3 ath9k_common,ath9k,ath9k_hw snd_pcm 97459 3 snd_hda_codec_hdmi,snd_hda_codec,snd_hda_intel mac80211 550587 1 ath9k i7core_edac 24220 0 edac_core 56455 1 i7core_edac lpc_ich 17061 0 cfg80211 196846 3 ath,ath9k,mac80211 mfd_core 13145 1 lpc_ich r8169 67600 0 mei 75376 0 mii 13527 1 r8169 snd_page_alloc 18101 2 snd_pcm,snd_hda_intel snd_timer 28690 2 snd_pcm,snd_seq snd 79242 13 snd_hda_codec_realtek,snd_hwdep,snd_timer,snd_hda_codec_hdmi,snd_pcm,snd_seq,snd_hda_codec,snd_hda_intel,snd_seq_device soundcore 14491 1 snd asus_laptop 24722 0 sparse_keymap 13526 1 asus_laptop rfkill 21384 6 cfg80211,asus_laptop,bluetooth input_polldev 13646 1 asus_laptop uinput 17673 0 crc32c_intel 12901 0 nouveau 891090 4 mxm_wmi 12823 1 nouveau wmi 18697 2 mxm_wmi,nouveau video 18936 1 nouveau i2c_algo_bit 13257 1 nouveau drm_kms_helper 44701 1 nouveau ttm 79760 1 nouveau drm 255010 6 ttm,drm_kms_helper,nouveau i2c_core 38314 5 drm,drm_kms_helper,i2c_algo_bit,nouveau,videodev As test cases, I've encoded both a dvd and a bd iso. Source iso is on local hdd and destination file is also on local hdd. I let the encodes complete, so ~20 minutes in the dvd case and ~2 hrs in the bd case. I don't see any significant decrease in encoding speed during that time. I'm using the "Normal" preset. dvd encodes at around 150fps and bd encodes at around 20fps. I've run these same tests on a ubuntu 12.10 installation with the similar results (much faster machine, 70fps bd encode). Kernel 3.5.0-17-generic.
anyone else running a RAID? I'm running an encrypted RAID 5 with 4 x 2TB and it seems like the slowdown only happens when the RAID is involved. If either the input or the output file (or both) is on the RAID, I get the slowdown. If however the output and input file are on both my SSD, everything seems to work fine!? The speed on the RAID seems to be fine, "hdparm -t" on the LUKS mapper device gives me > 200MB/s. On the RAID device itself even > 300MB/s.
Interesting. I am running software RAID and it is typically the home of both my source and destination files when encoding. I'll try something isolated to a non-RAID device (will be SSD as well in my case).
I second that this is an interesting development. On my i5-3450S (Ivy Bridge) machine both the source and destination files are on a software RAID 5 (mdadm) consisting of 4 x 3TB HDD's. This machine sees the slowdown. On my i3-2105 (Sandy Bridge) system the source files are on a NFS share and the destination file also resides on an NFS share. The NFS shares are on the i5-3450S (Ivy Bridge) system and the i3-2105 (Sandy Bridge) does not see the slowdown. I'll try to isolate and see what happens as well.
I was able to narrow this down in a test so that I also only saw the slow down when I was using the RAID device. Running kernel-3.6.8-2.fc17.x86_64 on the same system, I copied the source video file to two location: 1) SSD drive 2) RAID partition (software RAID 5) I was able to encode the file with the source/destination on the SSD without any slowdowns. Then, encoding the file on the RAID partition, it eventually slowed way down and stayed at that slow speed as described before. I was encoding with both the source and destination on the RAID device, so I'm not sure if it's one or the other or both that's at fault. I could test this further to try to narrow it down more if needed. To clarify a bit on my initial post, I don't always see the slow down in speed right away, it sometimes takes 30 minutes or an hour or more of encoding before it slows down, but once it slows down, it never increases above ~1-2fps encoding speed.
In my testing on my i3-2105 (Sandy Bridge): Linux desktop 3.6.8-1-ARCH #1 SMP PREEMPT Mon Nov 26 22:10:40 CET 2012 x86_64 GNU/Linux Source file is a 30 minute MPEG2 recording (animated) made by MythTV 0.25-fixes. HandBrakeCLI command line: HandBrakeCLI -i <mpeg2 file> -o speed_test.mkv -f mkv -m -e x264 --x264-tune animation -x b-adapt=2:rc-lookahead=50 -b 5000 -2 -T -w 1920 -X 1920 -Y 1080 -l 1080 -r 29.97 --cfr --crop=0:0:0:0 -d slow --strict-anamorphic -a 1 -E copy -s 1 1) Both source and destination on NFS share: no slowdown 2) Both source and destination on localhost software (mdadm) RAID 1 (2 x 1.5TB HDD): no slowdown 3) Both source and destination on localhost SSD (no RAID): no slowdown. My i5-3450S (Ivy Bridge) system I have been unable to test on because I use an older "good" kernel (3.4.12) and can't update to a recent kernel because it's in production. Does this slowdown happen only on i5 or i5,i7 CPUs? Does anyone have the slowdown on i3 or AMD CPUs? Here are the flags on my CPU's, as reported by /proc/cpuinfo: Intel(R) Core(TM) i3-2105 CPU @ 3.10GHz: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 popcnt tsc_deadline_timer xsave avx lahf_lm arat epb xsaveopt pln pts dtherm tpr_shadow vnmi flexpriority ept vpid Intel(R) Core(TM) i5-3450S CPU @ 2.80GHz: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm ida arat epb xsaveopt pln pts dtherm tpr_shadow vnmi flexpriority ept vpid fsgsbase smep erms The i5 has all the same as i3 with the addition of: aes f16c rdrand ida fsgsbase smep erms What these mean can be found here: http://www.acsu.buffalo.edu/~charngda/x86cpufeatures_linux.html *IF* I'm on the right track could it be Intel's Turbo Boost technology somehow mucking things up? My i3 does not have it, but my i5 does.
Sorry ahead of time if this doesn't meet the requirements for a good post, but I wanted to add some additional info: I had the same machine on Ubuntu 12.02 and saw no issues, when I went to 12.10 I have the slowdown. It's slow right out of the gate and stays there. I've been running the same soft-RAID (mdadm) *5* the whole time. The two bits of info I see as relevant - I'm running an i3 (2100t I believe - the low power one), so I don't think it's specific to i5/7, and I'm running an RAID5. The other guy posting he didn't see a slowdown was on a two disc raid making it unlikely to be a RAID5. Happy to post any specs anyone wants to see, just tell me how to make my computer spit them out. About the only bit of interesting softwares I'm running: HomerunHD tuner Classic Menu Indicator Pithos Subsonic All of which I was running on the old system as well. -Abe.
Quick bit of additional info: I ran compression all four ways: RAID->RAID, SSD->RAID, RAID->SSD, SSD->SSD All of them were slow. Turned off power savings in the bios, turned off speed step (which grayed out Turbo Boost), turned off GPU Render Standby... then my computer stopped booting properly (outside the scope of this bug report but pretty frustrating personally)... I'd tried it in the past though, so the results remain valid. Also tried something called "VMM" which sounds like a virtualization thing. The point is, none of that made any significant difference in the compression speeds. 1st pass, 1 CPU is at ~90%, the others at ~30% each. 2nd pass, they are all right around 100% - John would know better but I suspect the first pass doesn't lend itself to threading well. Basically, it all looks just how it did when it was running full speed, but pretty consistently a factor of 5-7 slower (7-9 fps verses 25). I noticed the change immediately on upgrade from 12.04 to 12.10. I remember a similar slowdown years back which went away "on its own" likely due to a system update.
I just upgraded to kernel 3.8-rc2 and it has been running for an hour without slowdown, so I'm pretty sure it has been fixed. Can anyone confirm this?
btw, kernel 3.7.3 seems to work fine as well. I thought I already tested 3.7, maybe it was fixed with one of the 3.7.x releases or I just remember it wrong... ;)
I just installed Arch Linux's latest kernel on my problem i5 machine: $ uname -a Linux mbe 3.7.5-1-ARCH #1 SMP PREEMPT Mon Jan 28 10:03:32 CET 2013 x86_64 GNU/Linux Intel(R) Core(TM) i5-3450S CPU @ 2.80GHz and the problem does not exist anymore for me as well. Mike
If 3.7 is fixing it, I am going to close this out. please reopen if you are seeing the slowdown with 3.7.5 or newer.