Bug 632236 - confused old_Hertz_hack - Unknown HZ value! (93) Assume 1024
Summary: confused old_Hertz_hack - Unknown HZ value! (93) Assume 1024
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: procps
Version: rawhide
Hardware: All
OS: Linux
Priority: low
Severity: medium
Target Milestone: ---
Assignee: Jan Görig
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Duplicates: 636738
Depends On:
Blocks:
Reported: 2010-09-09 13:31 UTC by Yanko Kaneti
Modified: 2018-04-11 09:11 UTC (History)
CC List: 24 users

Fixed In Version: procps-3.2.8-12.fc14
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2010-09-23 15:03:55 UTC
Type: ---
Embargoed:


Attachments
ps strace (4.18 KB, text/plain)
2010-09-09 13:31 UTC, Yanko Kaneti

Description Yanko Kaneti 2010-09-09 13:31:18 UTC
Created attachment 446248
ps strace

Description of problem:
"Unknown HZ value! (93) Assume 1024" is printed by some procps commands (ps, uptime, ...).

The reported value is sometimes 85 instead of 93.

Version-Release number of selected component (if applicable):
procps-3.2.8-11.fc14.x86_64
kernel-2.6.36-0.18.rc3.git1.fc15.x86_64

Attaching the strace of the issue, which has the full contents of /proc/stat and /proc/uptime at that time. As the uptime increased, the problem went away.

Comment 1 Vaclav "sHINOBI" Misek 2010-09-10 18:39:12 UTC
I can see: Unknown HZ value! (94) Assume 1024.

Unknown HZ value! (94) Assume 1024.

I think it appeared with procps-3.2.8-11.fc14.x86_64. I'm using latest Fedora 14.

Comment 2 Evan Klitzke 2010-09-10 20:03:43 UTC
I found this, which seems to describe the same problem: http://lkml.indiana.edu/hypermail/linux/kernel/0202.2/0403.html

It's an awfully old thread though, so I'm not sure whether it's really the same issue.

Comment 3 Adam Williamson 2010-09-12 19:20:36 UTC
I suspect this is more likely a kernel issue than a procps issue, re-assigning. dmesg output may be useful, and testing with 'nohz=off' and 'clocksource=acpi_pm' parameters. Do you see any problems that seem to be related to the message or is it just an odd message you noticed and thought to report?



-- 
Fedora Bugzappers volunteer triage team
https://fedoraproject.org/wiki/BugZappers

Comment 4 Vaclav "sHINOBI" Misek 2010-09-12 21:02:47 UTC
I'm not sure if it's a kernel problem, maybe yes, but maybe new procps just made it visible. I'm using 2.6.35.4-12.fc14.x86_64 and there were no such messages after installing this kernel.

Comment 5 Adam Williamson 2010-09-12 22:01:41 UTC

-- 
Fedora Bugzappers volunteer triage team
https://fedoraproject.org/wiki/BugZappers

Comment 6 Evan Klitzke 2010-09-13 03:14:11 UTC
I'm in the habit of shutting off my computer every night, and booting up in the morning, so I'm confident that this message happened as a result of a procps upgrade, not a kernel upgrade (I'm on 2.6.35.4-12.fc14.x86_64, fwiw). AFAICT, this is just an annoying message.

The default ~/.bashrc sources /etc/bashrc, which ends up invoking ps at some point; that means that whenever I start up a new terminal, I see this error message printed out as the first thing in the terminal. Other than that minor annoyance, I haven't actually observed any real problems.

Comment 7 Adam Williamson 2010-09-13 05:47:35 UTC
The fact that it arrived with a procps update doesn't mean it's a 'bug' in procps; it could just be a new informational message that it didn't print before. Given the content, which is talking about kernel timer ticks as I mentioned, I'm still pretty sure it has something to do with the kernel; it'd be nice if a kernel dev could look in :) Chuck?



-- 
Fedora Bugzappers volunteer triage team
https://fedoraproject.org/wiki/BugZappers

Comment 8 Jonathan Kamens 2010-09-13 17:57:39 UTC
(In reply to comment #3)
> I suspect this is more likely a kernel issue than a procps issue, re-assigning.
> dmesg output may be useful, and testing with 'nohz=off' and
> 'clocksource=acpi_pm' parameters. Do you see any problems that seem to be
> related to the message or is it just an odd message you noticed and thought to
> report?

Resetting needinfo? flag from comment #3, since Adam hasn't been answered yet.

Comment 9 Jonathan Kamens 2010-09-13 17:58:15 UTC
Interestingly, the message seems to go away when my machine has been up for a day or so.

Comment 10 Chuck Ebbert 2010-09-14 15:56:16 UTC
I found this, a fix for that problem from Aug 2008; it says it will be in the next procps but it's not. And it's not in ours either.

http://www.mail-archive.com/debian-bugs-dist@lists.debian.org/msg553706.html

Comment 11 Chuck Ebbert 2010-09-14 16:05:31 UTC
I'm still not sure what's going on there. Is it failing to find the AT_CLKTCK ELF note and falling back to the old hack? Or is it finding it with some strange value?
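
For reference, the value carried by the AT_CLKTCK entry can be read directly on a current glibc system. This is only a sketch and not how procps 3.2.8 obtains it (procps uses its own find_elf_note() helper, quoted in comment 12). It assumes getauxval(), which was added to glibc after this report was filed, and the function name clock_ticks_per_second is made up for the example.

```c
#include <assert.h>
#include <elf.h>       /* AT_CLKTCK */
#include <sys/auxv.h>  /* getauxval() */
#include <unistd.h>    /* sysconf(), _SC_CLK_TCK */

/* Hypothetical helper: returns the tick unit used for the counters
   in /proc/stat, as seen by user space. */
long clock_ticks_per_second(void)
{
    /* getauxval() returns 0 when the requested entry is absent. */
    unsigned long hz = getauxval(AT_CLKTCK);
    if (hz != 0)
        return (long)hz;

    /* Fallback: ask libc instead of guessing from /proc contents. */
    long fallback = sysconf(_SC_CLK_TCK);
    assert(fallback > 0);
    return fallback;
}
```

If a helper like this returns a sane value on an affected machine, the ELF note itself is fine and the fallback path is being taken for some other reason.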

Comment 12 Michal Schmidt 2010-09-17 12:54:46 UTC
From procps's proc/sysinfo.c:init_libproc():

static void init_libproc(void){
...
  if(linux_version_code > LINUX_VERSION(2, 4, 0)){ 
    Hertz = find_elf_note(AT_CLKTCK);
    if(Hertz!=NOTE_NOT_FOUND) return;
    fputs("2.4+ kernel w/o ELF notes? -- report this\n", stderr);
  }
  old_Hertz_hack();
}

If it did not detect the ELF note, it would print the message "2.4+ kernel w/o ELF notes? -- report this\n".
So the linux_version_code test must have failed. How could it fail? The function is not called explicitly, it is just declared as a constructor:

static void init_libproc(void) __attribute__((constructor));

And so is the function which sets linux_version_code:

static void init_Linux_version(void) __attribute__((constructor));

I bet the constructors are called in an unexpected (to the author) order.

Really, why depend on constructors instead of plain simple exactly defined function calls?
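
A minimal sketch of the "plain function calls" alternative, assuming a single entry point is acceptable: one constructor calls the two initialization steps in source order, so the version code is always set before it is tested. The function bodies are placeholders, not procps code; init_all, init_Hertz, the hard-coded version and the value 100 are all made up for the example.

```c
#include <assert.h>

/* Stand-in for the procps macro of the same name. */
#define LINUX_VERSION(x, y, z) (0x10000 * (x) + 0x100 * (y) + (z))

int linux_version_code;  /* stand-in for the procps global */
long Hertz;              /* stand-in for the procps global */

static void init_Linux_version(void)
{
    /* Placeholder: the real code derives this from the running kernel. */
    linux_version_code = LINUX_VERSION(2, 6, 36);
}

static void init_Hertz(void)
{
    /* Safe only because init_all() has already set the version code. */
    assert(linux_version_code != 0);
    if (linux_version_code > LINUX_VERSION(2, 4, 0))
        Hertz = 100;   /* placeholder for the ELF note lookup */
    else
        Hertz = -1;    /* placeholder for the old fallback path */
}

/* The only constructor: the call order below is the execution order. */
static void init_all(void) __attribute__((constructor));
static void init_all(void)
{
    init_Linux_version();
    init_Hertz();
}
```

With this layout the order no longer depends on how the object files happen to be linked.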

Comment 13 Gene Snider 2010-09-18 19:52:25 UTC
I'm not sure if this helps or not, but I've never seen this message on kernel-2.6.35.4-12.  I do see it whenever I boot into kernel-2.6.35.4-28, and I'm pretty sure I saw it in the -25 kernel as well.  This is on my F14 laptop.

Gene

Comment 14 Michal Schmidt 2010-09-20 09:50:28 UTC
old_Hertz_hack() is simply buggy. For instance:
 - It does not take iowait time into account, so if you have I/O load,
   it underestimates HZ.
 - It does not account for the possibility of CPUs going offline -
   in such a case it overestimates HZ.
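
The iowait case can be illustrated with a small calculation. This is a simplified stand-in, assuming the estimate is roughly the counted ticks divided by (uptime in seconds times number of CPUs); it is not the real old_Hertz_hack() code, and all figures are hypothetical.

```c
#include <assert.h>

/* Simplified estimate: ticks counted / (uptime seconds * CPUs),
   rounded to the nearest integer. */
long estimate_hz(unsigned long long counted_ticks,
                 double uptime_seconds, int ncpus)
{
    assert(uptime_seconds > 0.0 && ncpus > 0);
    return (long)(counted_ticks / (uptime_seconds * ncpus) + 0.5);
}

/* Hypothetical machine: 1 CPU, up for 1000 s, true rate 100 ticks/s,
   so 100000 ticks in total, 7000 of which were spent in iowait. */
long hz_ignoring_iowait(void)
{
    return estimate_hz(100000ULL - 7000ULL, 1000.0, 1);
}

long hz_counting_iowait(void)
{
    return estimate_hz(100000ULL, 1000.0, 1);
}
```

With these made-up numbers, leaving out the iowait ticks yields 93 instead of 100. Under this model the error would also shrink as uptime grows and the share of iowait ticks falls.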

But the main point stands: old_Hertz_hack() should not be called *at all*.
The bug is due to undefined execution order of constructors,
which may change with linking order.
A minimal fix would assign priorities to the constructors for their order
to be defined. An optimal fix would get rid of the usage of constructors.
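
A sketch of the minimal fix, assuming GCC-style constructor priorities (lower numbers run first, and values up to 100 are reserved for the implementation). The priorities 101 and 102, the hard-coded version value and the variable libproc_saw_version are illustrative, and this is not necessarily what the Debian patch does.

```c
#include <assert.h>

int linux_version_code;   /* stand-in for the procps global */
int libproc_saw_version;  /* hypothetical: records what init_libproc observed */

/* Explicit priorities make the order independent of link order. */
static void init_Linux_version(void) __attribute__((constructor(101)));
static void init_libproc(void) __attribute__((constructor(102)));

static void init_Linux_version(void)
{
    /* Placeholder for 2.6.36; the real code asks the running kernel. */
    linux_version_code = 0x020624;
}

static void init_libproc(void)
{
    /* Priority 102 runs after 101, so the version code is already set. */
    assert(linux_version_code != 0);
    libproc_saw_version = linux_version_code;
}
```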

Comment 15 Michal Schmidt 2010-09-20 10:24:25 UTC
BTW, I filed bug 635607 for make, as I believe the link order change is unexpected in GNU make 3.82.
But procps has to be fixed in any case.

Comment 16 Roman Rakus 2010-09-23 08:48:27 UTC
*** Bug 636738 has been marked as a duplicate of this bug. ***

Comment 17 cyrushmh 2010-09-23 08:59:03 UTC
I found this http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=460331
but my English is not good.

Comment 18 Jan Görig 2010-09-23 14:23:21 UTC
Thank you for your comments. This issue should be fixed in procps-3.2.8-12.fc15 by a slightly modified Debian patch. Could you try it, please? I am not able to reproduce this bug.

Comment 19 Michal Schmidt 2010-09-23 14:55:44 UTC
procps-3.2.8-12.fc15 works fine for me and the patch looks fine.

Comment 20 Jan Görig 2010-09-23 15:03:55 UTC
Ok, closing.

Comment 21 Michal Schmidt 2010-09-23 15:47:19 UTC
It also affects F-14 (procps-3.2.8-11.fc14). Please issue an update. Thanks.

Comment 22 Fedora Update System 2010-09-23 16:41:12 UTC
procps-3.2.8-12.fc14 has been submitted as an update for Fedora 14.
https://admin.fedoraproject.org/updates/procps-3.2.8-12.fc14

Comment 23 Fedora Update System 2010-09-25 05:32:23 UTC
procps-3.2.8-12.fc14 has been pushed to the Fedora 14 stable repository.  If problems still persist, please make note of it in this bug report.

