Bug 1256961 - Re-activate CONFIG_SCHEDSTATS
Status: CLOSED DEFERRED
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: kernel
Version: 7.0
Hardware: x86_64 Linux
Priority: unspecified
Severity: high
Target Milestone: rc
Target Release: ---
Assigned To: Larry Woodman
QA Contact: Red Hat Kernel QE team
Depends On:
Blocks:
Reported: 2015-08-25 17:44 EDT by Jan Schreiber
Modified: 2017-06-29 13:30 EDT

See Also:
Fixed In Version:
Doc Type: Enhancement
Doc Text:
Feature: Compile the RHEL 7 kernel with CONFIG_SCHEDSTATS enabled. Reason: Make the detailed /proc/<PID>/sched statistics available again. Result: The same statistics are available as in RHEL 6.x, which helps with performance investigations.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2015-09-02 14:40:37 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Description Jan Schreiber 2015-08-25 17:44:04 EDT
Description of problem:
Since CONFIG_SCHEDSTATS was disabled in RHEL 7, invaluable metrics are missing from /proc/<PID>/task/<TID>/sched.


Version-Release number of selected component (if applicable):
RHEL 6.x still had these metrics. From 7.x on they are missing.

How reproducible:
# cat /proc/self/sched

Actual results:
cat (30286, #threads: 1)
-------------------------------------------------------------------
se.exec_start                                :   33010432583.086650
se.vruntime                                  :      18693974.064711
se.sum_exec_runtime                          :             3.290863
nr_switches                                  :                    2
nr_voluntary_switches                        :                    1
nr_involuntary_switches                      :                    1
se.load.weight                               :                 1024
policy                                       :                    0
prio                                         :                  120
clock-delta                                  :                  235
mm->numa_scan_seq                            :                    0
numa_migrations, 0
numa_faults_memory, 0, 0, 1, 0, -1
numa_faults_memory, 1, 0, 0, 0, -1


Expected results:
cat (100113, #threads: 1)
---------------------------------------------------------
se.exec_start                      :    1931229972.660934
se.vruntime                        :       5310022.323062
se.sum_exec_runtime                :             0.612904
se.wait_start                      :             0.000000
se.sleep_start                     :             0.000000
se.block_start                     :             0.000000
se.sleep_max                       :             0.000000
se.block_max                       :             0.000000
se.exec_max                        :             0.408455
se.slice_max                       :             0.000000
se.wait_max                        :             0.009067
se.wait_sum                        :             0.009067
se.wait_count                      :                    3
se.iowait_sum                      :           252.019812
se.iowait_count                    :                   90
sched_info.bkl_count               :                    0
se.nr_migrations                   :                    2
se.nr_migrations_cold              :                    0
se.nr_failed_migrations_affine     :                    0
se.nr_failed_migrations_running    :                    0
se.nr_failed_migrations_hot        :                    0
se.nr_forced_migrations            :                    0
se.nr_wakeups                      :                    0
se.nr_wakeups_sync                 :                    0
se.nr_wakeups_migrate              :                    0
se.nr_wakeups_local                :                    0
se.nr_wakeups_remote               :                    0
se.nr_wakeups_affine               :                    0
se.nr_wakeups_affine_attempts      :                    0
se.nr_wakeups_passive              :                    0
se.nr_wakeups_idle                 :                    0
avg_atom                           :             0.612904
avg_per_cpu                        :             0.306452
nr_switches                        :                    1
nr_voluntary_switches              :                    0
nr_involuntary_switches            :                    1
se.load.weight                     :                 1024
policy                             :                    0
prio                               :                  120
clock-delta                        :                   65
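
The difference between the two outputs above can be checked mechanically. The following is a minimal sketch, not part of the original report: it parses the "name : value" lines of /proc/<PID>/sched and reports whether the extended statistics are present. The helper names (parse_sched, has_schedstats, missing_fields) are hypothetical, and the marker fields are taken from the RHEL 6 output above; other kernels may prefix these names differently, so the check matches on the suffix only.

```python
import re

# Matches lines such as "se.wait_sum      :     0.009067".
# The header line, the dashed separator and the comma-separated
# numa_* lines do not match and are skipped.
FIELD_RE = re.compile(r"^(\S+)\s*:\s*(\S+)$")

# Suffixes of fields that only appear in the RHEL 6 style output above.
# Assumption: their presence indicates that schedstats are being reported.
SCHEDSTATS_SUFFIXES = (".wait_sum", ".wait_max", ".iowait_sum")


def parse_sched(text):
    """Return a dict mapping field name to its value (kept as a string)."""
    fields = {}
    for line in text.splitlines():
        match = FIELD_RE.match(line.strip())
        if match:
            fields[match.group(1)] = match.group(2)
    return fields


def has_schedstats(fields):
    """True if at least one of the extended statistics fields is present."""
    return any(name.endswith(SCHEDSTATS_SUFFIXES) for name in fields)


def missing_fields(actual, expected):
    """Names that occur in the expected output but not in the actual one."""
    return sorted(set(expected) - set(actual))


# Example use on a live system (Linux only):
#   with open("/proc/self/sched") as f:
#       print(has_schedstats(parse_sched(f.read())))
```

Values are kept as strings so that the nanosecond-resolution numbers are not rounded by a float conversion.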


Additional info:
This is a follow-up to https://bugzilla.redhat.com/show_bug.cgi?id=1013225, which was filed against Fedora.
Comment 2 Jan Kurik 2015-08-31 02:48:22 EDT
Marking this bug as public on reporter's request.
Comment 3 Larry Woodman 2015-09-02 14:36:52 EDT
We can't take this in a dot release because it changes kernel data structures and breaks the kABI:

[root@lwoodman7 rhel7]# vi .config                 //set CONFIG_SCHEDSTATS=y
[root@lwoodman7 rhel7]# make oldconfig
[root@lwoodman7 rhel7]# make -j9 bzImage
[root@lwoodman7 rhel7]# make rh-check-kabi
make -C redhat rh-check-kabi 
BUILDID is ".test". Update '/home/GIT/RHEL7.2/rhel7/localversion' to change.
TEST is "None". Update '/home/GIT/RHEL7.2/rhel7/tests' to change.
make[1]: Entering directory `/home/GIT/RHEL7.2/rhel7/redhat'
*** ERROR - ABI BREAKAGE WAS DETECTED ***

The following symbols have been changed (this will cause an ABI breakage):

dev_get_stats
invalidate_bdev
scsi_host_alloc
hci_register_dev
dev_addr_add
blk_queue_merge_bvec
vmalloc_to_page
dev_mc_add
rtnl_link_register
blk_queue_dma_alignment
fs_bio_set
eth_type_trans
ethtool_op_get_link
nf_register_hooks
blk_make_request
arp_send
kernel_getsockopt
nf_unregister_hooks
skb_tstamp_tx
skb_queue_purge
kthread_bind
clear_page_dirty_for_io
blk_queue_segment_boundary
eth_change_mtu
bdi_register_dev
dev_get_by_name
pci_enable_device
blk_queue_logical_block_size
skb_copy_expand
blk_sync_queue
end_page_writeback
scsi_device_put
__dev_get_by_name
set_disk_ro
platform_device_add
pci_bus_read_config_byte
blkdev_get
alloc_netdev_mqs
alloc_etherdev_mqs
alloc_disk
kill_block_super
framebuffer_release
consume_skb
blk_put_request
device_del
netdev_rx_handler_unregister
pci_bus_write_config_dword
vmap
__netdev_alloc_skb
register_netdevice
xt_unregister_targets
blk_queue_bounce_limit
scsi_host_lookup
dev_mc_del
bdi_init
cdev_alloc
__mmu_notifier_register
kill_pid
free_netdev
eth_validate_addr
__lock_page
napi_get_frags
find_get_pages_tag
netif_napi_add
blk_init_queue
blk_get_backing_dev_info
ipv6_skip_exthdr
module_layout
skb_pad
blk_start_queue
block_write_full_page
proc_mkdir
skb_copy_datagram_iovec
kthread_stop
__genl_register_family
bio_endio
pci_find_capability
ipmi_unregister_smi
dev_set_name
napi_gro_frags
mutex_unlock
pci_set_master
netif_device_attach
skb_push
sock_create_kern
bdget
node_data
__napi_complete
dev_kfree_skb_irq
skb_dequeue
blk_get_queue
proc_create_data
netlink_broadcast
netdev_master_upper_dev_link
blk_cleanup_queue
scsi_device_lookup
__dynamic_dev_dbg
blkdev_get_by_dev
skb_copy
kfree_skb
dev_set_mac_address
device_remove_file
blk_execute_rq_nowait
hci_alloc_dev
__register_chrdev
dev_open
pv_cpu_ops
napi_gro_receive
kernel_setsockopt
set_page_dirty
kernel_bind
__ethtool_get_settings
unlock_page
bdevname
kthread_create_on_node
dev_addr_del
nla_put
find_vma
device_register
kmem_cache_create
pci_bus_read_config_dword
blk_queue_softirq_done
skb_copy_bits
blk_execute_rq
generic_make_request
skb_make_writable
fsync_bdev
ioctl_by_bdev
module_refcount
scsi_is_sdev_device
netlink_unicast
skb_queue_head
skb_unlink
kernel_sendmsg
netdev_master_upper_dev_get
netif_rx
cdev_init
bio_alloc_bioset
skb_trim
blk_queue_bounce
dev_trans_start
block_write_begin
dev_get_by_index
platform_device_alloc
genlmsg_put
vlan_dev_vlan_id
napi_complete
blk_stop_queue
current_task
dma_supported
skb_checksum
blk_fetch_request
scsi_remove_device
pci_bus_read_config_word
bdget_disk
netif_device_detach
blkdev_get_by_path
init_task
lock_sock_nested
set_device_ro
root_device_unregister
vm_mmap
blk_queue_max_segments
dev_err
pagevec_lookup_tag
kernel_recvmsg
vlan_dev_real_dev
ref_module
__root_device_register
pskb_expand_head
unmap_mapping_range
genl_unregister_family
netdev_rx_handler_register
mmu_notifier_unregister
__skb_gso_segment
__nlmsg_put
pci_request_regions
find_or_create_page
bio_init
netif_carrier_on
__blockdev_direct_IO
hci_unregister_dev
blk_queue_make_request
device_create_file
kmalloc_caches
nla_reserve
dev_warn
unregister_netdev
skb_partial_csum_set
dev_get_drvdata
scsi_device_get
eth_mac_addr
ether_setup
netif_napi_del
__task_pid_nr_ns
sock_alloc_send_skb
__alloc_skb
unregister_netdevice_queue
scsi_remove_host
wake_up_process
sync_blockdev
__pskb_pull_tail
blkdev_issue_discard
dev_set_allmulti
truncate_inode_pages
scsi_is_fc_rport
___pskb_trim
dev_set_drvdata
pci_release_regions
proc_mkdir_mode
ipmi_register_smi
scsi_host_put
xt_register_targets
pci_disable_device
netif_carrier_off
scsi_host_set_state
blk_rq_map_kern
bio_put
dma_ops
pv_mmu_ops
skb_realloc_headroom
ioc4_unregister_submodule
blk_put_queue
mutex_lock
blk_get_request
ipmi_smi_msg_received
netif_set_real_num_tx_queues
netdev_change_features
blk_alloc_queue
kmem_cache_destroy
pci_bus_write_config_word
set_blocksize
device_create
mutex_trylock
sk_alloc
bio_add_page
inc_zone_page_state
dev_set_mtu
module_put
release_sock
get_device
register_pernet_subsys
flush_signals
register_netdev
__napi_schedule
__skb_get_hash
dev_kfree_skb_any
pagevec_lookup
aio_complete
del_gendisk
bio_get_nr_vecs
__pagevec_release
lookup_bdev
pci_bus_write_config_byte
device_destroy
platform_device_unregister
add_disk
skb_queue_tail
put_disk
__netif_schedule
submit_bio
current_fs_time
rtnl_link_unregister
wait_on_page_bit
skb_checksum_help
put_device
netdev_update_features
bdi_destroy
pci_unregister_driver
cdev_add
framebuffer_alloc
mapping_tagged
hci_free_dev
icmpv6_send
dev_close
dma_set_mask
unregister_pernet_subsys
__module_get
blk_queue_stack_limits
arp_create
blkdev_put
skb_clone
netif_rx_ni
__blk_put_request
find_get_page
sk_free
dst_release
find_module
__mutex_init
platform_device_put
vga_set_legacy_decoding
kmem_cache_alloc
bdput
ip6_route_output
dev_set_promiscuity
__dev_get_by_index
inet_proto_csum_replace4
arp_xmit
kmem_cache_free
skb_put
device_unregister
skb_pull
scsi_reset_provider
bdi_unregister
try_module_get
__pci_register_driver
pci_clear_master
_dev_info
invalidate_partition
dev_queue_xmit
bio_clone_bioset
elevator_change
kernel_sock_ioctl
blk_queue_max_hw_sectors
netif_receive_skb
netdev_features_change
put_page
dev_printk
scsi_add_host_with_dma
__skb_checksum_complete
skb_recv_datagram
mark_page_accessed
skb_dequeue_tail
init_net

make[1]: *** [rh-check-kabi] Error 1
make[1]: Leaving directory `/home/GIT/RHEL7.2/rhel7/redhat'
make: *** [rh-check-kabi] Error 2
[root@lwoodman7 rhel7]#
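
Why an extra set of statistics fields breaks binary compatibility can be illustrated with a small sketch. The two structures below are hypothetical stand-ins, not the real kernel definitions: they only show that adding fields in the middle of a structure changes its size and moves every member that follows, which is the kind of change a kABI check is meant to catch.

```python
import ctypes


class EntityWithoutStats(ctypes.Structure):
    # Hypothetical layout standing in for a struct built with the option off.
    _fields_ = [
        ("exec_start", ctypes.c_uint64),
        ("vruntime", ctypes.c_uint64),
    ]


class EntityWithStats(ctypes.Structure):
    # Same struct with illustrative statistics fields compiled in.
    _fields_ = [
        ("exec_start", ctypes.c_uint64),
        ("wait_start", ctypes.c_uint64),
        ("wait_sum", ctypes.c_uint64),
        ("vruntime", ctypes.c_uint64),
    ]


def layout(struct):
    """Return a dict mapping each member name to its byte offset."""
    return {name: getattr(struct, name).offset for name, _ in struct._fields_}


print(ctypes.sizeof(EntityWithoutStats), layout(EntityWithoutStats))
print(ctypes.sizeof(EntityWithStats), layout(EntityWithStats))
```

In this sketch `vruntime` moves from offset 8 to offset 24. A module compiled against the first layout would read the wrong bytes from a kernel using the second, so any exported symbol whose types reach such a structure would be expected to show up in a list like the one above.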
Comment 4 Larry Woodman 2015-09-02 14:40:37 EDT
We will have to do this for RHEL8.

Larry
Comment 5 Trinh Dao 2015-09-14 15:31:38 EDT
Larry, was there a reason why this was not added in RHEL7?
Comment 6 Akemi Yagi 2015-10-26 13:21:38 EDT
I think the reason was given in comment #3:

"We can't take this in a dot release because it changes kernel data structures and breaks the kABI"
Comment 8 Akemi Yagi 2017-02-27 14:51:24 EST
Looks like this was done without waiting for RHEL 8. RHEL 7.3 kernels (3.10.0-514) have CONFIG_SCHEDSTATS=y.
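
One way to confirm this on an installed kernel is to look the option up in the kernel's build configuration. The sketch below assumes the configuration is available as a text file at /boot/config-<kernel release>, which is the usual location on RHEL but is not stated in this report; the function name config_value is hypothetical.

```python
import re


def config_value(config_text, option):
    """Return the value of a kernel config option, or None if it is not set.

    A line such as "# CONFIG_SCHEDSTATS is not set" starts with "#" and
    therefore does not match, so None is returned for disabled options.
    """
    pattern = re.compile(r"^%s=(.*)$" % re.escape(option))
    for line in config_text.splitlines():
        match = pattern.match(line.strip())
        if match:
            return match.group(1)
    return None


# Example use (assumed path, Linux only):
#   import platform
#   with open("/boot/config-" + platform.release()) as f:
#       print(config_value(f.read(), "CONFIG_SCHEDSTATS"))
```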
Comment 9 Jan Schreiber 2017-02-28 07:32:49 EST
Interesting... Although CONFIG_SCHEDSTATS=y is set, the output of /proc/<PID>/sched is still limited...

Looking at the source, I see that sched_schedstats is checked as well, so a
# sysctl kernel.sched_schedstats=1
brought back our invaluable scheduling statistics.

Yes!

Tested on RHEL 7.3 with 3.10.0-568.
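
The runtime switch mentioned above can also be read without the sysctl tool, since sysctl names map to files under /proc/sys. A minimal sketch follows, assuming the kernel exposes kernel.sched_schedstats as /proc/sys/kernel/sched_schedstats; the function name schedstats_enabled is hypothetical.

```python
def schedstats_enabled(path="/proc/sys/kernel/sched_schedstats"):
    """Return True or False for the switch, or None if the file is absent.

    None is what a kernel without this sysctl would produce.
    """
    try:
        with open(path) as f:
            return f.read().strip() == "1"
    except FileNotFoundError:
        return None


# Example use (Linux only):
#   print(schedstats_enabled())
```

The path is a parameter so the function can be pointed at a different file, for example when testing. Enabling the switch still requires root, as with the sysctl command shown above.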
