1. 29 Sep, 2011 3 commits
  2. 11 Aug, 2011 1 commit
    • blktrace: add FLUSH/FUA support · c09c47ca
      Namhyung Kim committed
      Add FLUSH/FUA support to blktrace. As FLUSH precedes WRITE and/or
      FUA follows WRITE, use the same 'F' flag for both cases and
      distinguish them by their (relative) position. The end results
      look like (other flags might be shown also):
      
       - WRITE:            W
       - WRITE_FLUSH:      FW
       - WRITE_FUA:        WF
       - WRITE_FLUSH_FUA:  FWF
      
      Note that we reuse TC_BARRIER because act_mask has no spare bits,
      so older versions of the blktrace tools will report flush
      requests as barriers from now on.
      
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Signed-off-by: Namhyung Kim <namhyung@gmail.com>
      Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
      Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
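The positional 'F' convention described in the blktrace commit can be sketched as follows. This is a minimal illustration in Python, not the kernel code: the function names `write_flags` and `decode_write_flags` are hypothetical, and only the WRITE cases listed in the commit message are modelled.

```python
def write_flags(flush=False, fua=False):
    """Build the flag string for a write request.

    An 'F' before 'W' means FLUSH, an 'F' after 'W' means FUA,
    matching the four cases listed in the commit message.
    """
    flags = ""
    if flush:
        flags += "F"  # FLUSH precedes the write
    flags += "W"
    if fua:
        flags += "F"  # FUA follows the write
    return flags


def decode_write_flags(flags):
    """Recover (flush, fua) from the position of 'F' relative to 'W'."""
    w = flags.index("W")
    return ("F" in flags[:w], "F" in flags[w + 1:])


if __name__ == "__main__":
    for flush, fua in [(False, False), (True, False), (False, True), (True, True)]:
        print(flush, fua, write_flags(flush, fua))
```

Because the same letter is reused, a decoder has to look at the position of 'F' relative to 'W' rather than just its presence.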
  3. 31 Jul, 2011 1 commit
  4. 26 Jul, 2011 1 commit
    • xen/tracing: fix compile errors when tracing is disabled. · b3c4b982
      Jeremy Fitzhardinge committed
      When CONFIG_FUNCTION_TRACER is disabled, compilation fails as follows:
        CC      arch/x86/xen/setup.o
      In file included from arch/x86/include/asm/xen/hypercall.h:42,
                       from arch/x86/xen/setup.c:19:
      include/trace/events/xen.h:31: warning: 'struct multicall_entry' declared inside parameter list
      include/trace/events/xen.h:31: warning: its scope is only this definition or declaration, which is probably not what you want
      include/trace/events/xen.h:31: warning: 'struct multicall_entry' declared inside parameter list
      include/trace/events/xen.h:31: warning: 'struct multicall_entry' declared inside parameter list
      include/trace/events/xen.h:31: warning: 'struct multicall_entry' declared inside parameter list
      [...]
      arch/x86/xen/trace.c:5: error: '__HYPERVISOR_set_trap_table' undeclared here (not in a function)
      arch/x86/xen/trace.c:5: error: array index in initializer not of integer type
      arch/x86/xen/trace.c:5: error: (near initialization for 'xen_hypercall_names')
      arch/x86/xen/trace.c:6: error: '__HYPERVISOR_mmu_update' undeclared here (not in a function)
      arch/x86/xen/trace.c:6: error: array index in initializer not of integer type
      arch/x86/xen/trace.c:6: error: (near initialization for 'xen_hypercall_names')
      
      Fix this by making sure struct multicall_entry has a declaration in
      scope at all times, and don't bother compiling xen/trace.c when tracing
      is disabled.
      Reported-by: Randy Dunlap <rdunlap@xenotime.net>
      Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
  5. 20 Jul, 2011 1 commit
  6. 19 Jul, 2011 9 commits
  7. 11 Jul, 2011 3 commits
  8. 10 Jul, 2011 2 commits
    • writeback: trace global_dirty_state · e1cbe236
      Wu Fengguang committed
      Add trace event global_dirty_state for showing the global dirty page
      counts and thresholds at each global_dirty_limits() invocation.  This
      will cover the callers throttle_vm_writeout(), over_bground_thresh()
      and each balance_dirty_pages() loop.
      Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
    • writeback: make writeback_control.nr_to_write straight · d46db3d5
      Wu Fengguang committed
      Pass struct wb_writeback_work all the way down to writeback_sb_inodes(),
      and initialize the struct writeback_control there.
      
      struct writeback_control is basically designed to control writeback of a
      single file, but we keep abusing it for writing multiple files in
      writeback_sb_inodes() and its callers.
      
      This immediately cleans things up: e.g. wbc.nr_to_write vs
      work->nr_pages suddenly starts to make sense, and instead of saving and
      restoring pages_skipped in writeback_sb_inodes() it can always start
      with a clean zero value.
      
      It also makes a neat IO pattern change: large dirty files are now
      written in the full 4MB writeback chunk size, rather than whatever
      quota remained in wbc->nr_to_write.
      Acked-by: Jan Kara <jack@suse.cz>
      Proposed-by: Christoph Hellwig <hch@infradead.org>
      Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
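The IO pattern change mentioned in the nr_to_write commit can be pictured with a toy model. The sketch below is not the kernel logic: it assumes 4 KiB pages (so a 4MB chunk is 1024 pages), ignores the overall work->nr_pages budget, and uses the hypothetical names `shared_quota` and `per_file_chunk`.

```python
PAGE_SIZE = 4096                                # assumed page size
CHUNK_PAGES = (4 * 1024 * 1024) // PAGE_SIZE    # 4MB chunk = 1024 pages


def shared_quota(dirty_pages_per_file, quota=CHUNK_PAGES):
    """Toy model of the old pattern: one nr_to_write budget is shared,
    so later files only get whatever quota remains."""
    written = []
    for dirty in dirty_pages_per_file:
        n = min(dirty, quota)
        written.append(n)
        quota -= n
    return written


def per_file_chunk(dirty_pages_per_file, chunk=CHUNK_PAGES):
    """Toy model of the new pattern: every file starts with a full chunk."""
    return [min(dirty, chunk) for dirty in dirty_pages_per_file]


if __name__ == "__main__":
    files = [300, 5000]  # dirty pages in a small file and a large file
    print("shared quota:  ", shared_quota(files))
    print("per-file chunk:", per_file_chunk(files))
```

In this model the large file that follows a small one is written as a 724-page remainder under the shared budget, but as a full 1024-page chunk when each file gets its own.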
  9. 06 Jul, 2011 1 commit
  10. 25 Jun, 2011 2 commits
    • jbd: Add fixed tracepoints · 99cb1a31
      Lukas Czerner committed
      This commit adds fixed tracepoints for jbd. They are based on the fixed
      tracepoints for jbd2; however, those for collecting statistics are
      missing, since I think they would require a more intrusive patch that
      should have its own commit, if someone decides that it is needed. There
      are also new tracepoints in __journal_drop_transaction() and
      journal_update_superblock().
      
      The list of jbd tracepoints:
      
      jbd_checkpoint
      jbd_start_commit
      jbd_commit_locking
      jbd_commit_flushing
      jbd_commit_logging
      jbd_drop_transaction
      jbd_end_commit
      jbd_do_submit_data
      jbd_cleanup_journal_tail
      jbd_update_superblock_end
      Signed-off-by: Lukas Czerner <lczerner@redhat.com>
      Cc: Jan Kara <jack@suse.cz>
      Signed-off-by: Jan Kara <jack@suse.cz>
    • ext3: Add fixed tracepoints · 785c4bcc
      Lukas Czerner committed
      This commit adds fixed tracepoints to the ext3 code. It is based on the
      ext4 tracepoints; however, because of the differences between the two
      file systems, some tracepoints are missing (those for delalloc and for
      the multi-block allocator) and some are ext3 specific (those for
      reservation windows).
      
      Here is a list:
      
      ext3_free_inode
      ext3_request_inode
      ext3_allocate_inode
      ext3_evict_inode
      ext3_drop_inode
      ext3_mark_inode_dirty
      ext3_write_begin
      ext3_ordered_write_end
      ext3_writeback_write_end
      ext3_journalled_write_end
      ext3_ordered_writepage
      ext3_writeback_writepage
      ext3_journalled_writepage
      ext3_readpage
      ext3_releasepage
      ext3_invalidatepage
      ext3_discard_blocks
      ext3_request_blocks
      ext3_allocate_blocks
      ext3_free_blocks
      ext3_sync_file_enter
      ext3_sync_file_exit
      ext3_sync_fs
      ext3_rsv_window_add
      ext3_discard_reservation
      ext3_alloc_new_reservation
      ext3_reserved
      ext3_forget
      ext3_read_block_bitmap
      ext3_direct_IO_enter
      ext3_direct_IO_exit
      ext3_unlink_enter
      ext3_unlink_exit
      ext3_truncate_enter
      ext3_truncate_exit
      ext3_get_blocks_enter
      ext3_get_blocks_exit
      ext3_load_inode
      Signed-off-by: Lukas Czerner <lczerner@redhat.com>
      Cc: Jan Kara <jack@suse.cz>
      Signed-off-by: Jan Kara <jack@suse.cz>
  11. 22 Jun, 2011 2 commits
  12. 16 Jun, 2011 2 commits
    • vmscan: implement swap token priority aging · d7911ef3
      KOSAKI Motohiro committed
      While testing the memcg-aware swap token, I observed that the swap token
      was often grabbed by an intermittently running process (e.g. init,
      auditd) that never released it.
      
      Why?
      
      Some processes (e.g. init, auditd, audispd) wake up when a process exits.
      When the exiting process leaves the swap token without an owner, the
      first process to page in can take it.  Thus such intermittently running
      processes often get the token.
      
      Currently, swap token priority is decreased only in the page fault path.
      So if a process sleeps immediately after grabbing the swap token, its
      token priority is never decreased.  That's obviously undesirable.
      
      This patch implements a very crude (and lightweight) priority aging.  It
      affects only the corner case above and doesn't change the performance of
      swap-heavy workloads (e.g. a multi-process qsbench load).
      Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Reviewed-by: Rik van Riel <riel@redhat.com>
      Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
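The corner case in the priority aging commit, a token holder that goes to sleep and so never has its priority lowered, can be illustrated with a toy model. Everything specific here is an assumption made for illustration and is not taken from the patch: the `TokenState` class, the aging interval, and halving as the aging step.

```python
class TokenState:
    """Toy model of a single swap token with priority aging."""

    def __init__(self, aging_interval=256):
        self.aging_interval = aging_interval  # hypothetical value
        self.owner = None
        self.owner_prio = 0
        self.global_faults = 0
        self.last_aging = 0

    def on_fault(self, mm, prio):
        """Called for every page-in, by any process."""
        self.global_faults += 1
        # Age the current holder based on global fault activity, so a
        # sleeping holder loses priority even though it faults no more.
        if (self.owner is not None
                and self.global_faults - self.last_aging > self.aging_interval):
            self.owner_prio //= 2  # assumed aging step
            self.last_aging = self.global_faults
        # A faulting process takes the token if it is free or outranked.
        if self.owner is None or prio > self.owner_prio:
            self.owner, self.owner_prio = mm, prio


if __name__ == "__main__":
    state = TokenState(aging_interval=4)
    state.on_fault("init", 8)          # init grabs the free token, then sleeps
    for _ in range(9):
        state.on_fault("worker", 3)    # a busy process keeps faulting
    print(state.owner)
```

Without the aging step in `on_fault`, the sleeping holder in this model would keep the token indefinitely, which is the behaviour the commit message calls undesirable.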
    • vmscan: implement swap token trace · 83cd81a3
      KOSAKI Motohiro committed
      This is useful for observing swap token activity.
      
      example output:
      
                   zsh-1845  [000]   598.962716: update_swap_token_priority:
      mm=ffff88015eaf7700 old_prio=1 new_prio=0
                memtoy-1830  [001]   602.033900: update_swap_token_priority:
      mm=ffff880037a45880 old_prio=947 new_prio=949
                memtoy-1830  [000]   602.041509: update_swap_token_priority:
      mm=ffff880037a45880 old_prio=949 new_prio=951
                memtoy-1830  [000]   602.051959: update_swap_token_priority:
      mm=ffff880037a45880 old_prio=951 new_prio=953
                memtoy-1830  [000]   602.052188: update_swap_token_priority:
      mm=ffff880037a45880 old_prio=953 new_prio=955
                memtoy-1830  [001]   602.427184: put_swap_token:
      token_mm=ffff880037a45880
                   zsh-1789  [000]   602.427281: replace_swap_token:
      old_token_mm=          (null) old_prio=0 new_token_mm=ffff88015eaf7018
      new_prio=2
                   zsh-1789  [001]   602.433456: update_swap_token_priority:
      mm=ffff88015eaf7018 old_prio=2 new_prio=4
                   zsh-1789  [000]   602.437613: update_swap_token_priority:
      mm=ffff88015eaf7018 old_prio=4 new_prio=6
                   zsh-1789  [000]   602.443924: update_swap_token_priority:
      mm=ffff88015eaf7018 old_prio=6 new_prio=8
                   zsh-1789  [000]   602.451873: update_swap_token_priority:
      mm=ffff88015eaf7018 old_prio=8 new_prio=10
                   zsh-1789  [001]   602.462639: update_swap_token_priority:
      mm=ffff88015eaf7018 old_prio=10 new_prio=12
      Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Acked-by: Rik van Riel <riel@redhat.com>
      Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
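The example output in the swap token trace commit is easy to post-process. The sketch below is one possible parser, assuming the wrapped layout shown in the commit message, where the `key=value` fields continue on the following line or lines. The function name `parse_trace` and the record layout are hypothetical.

```python
import re

# "<comm>-<pid>  [<cpu>]  <timestamp>: <event>: <optional fields>"
HEADER = re.compile(
    r"^\s*(?P<comm>\S+)-(?P<pid>\d+)\s+\[(?P<cpu>\d+)\]\s+"
    r"(?P<ts>\d+\.\d+):\s+(?P<event>\w+):\s*(?P<rest>.*)$"
)
# key=value pairs; a value may be padded, as in "old_token_mm=   (null)"
FIELD = re.compile(r"(\w+)=\s*(\(null\)|\S+)")


def parse_trace(text):
    """Return one dict per trace event, merging wrapped field lines."""
    records = []
    for line in text.splitlines():
        m = HEADER.match(line)
        if m:
            records.append({
                "comm": m["comm"], "pid": int(m["pid"]), "cpu": int(m["cpu"]),
                "ts": float(m["ts"]), "event": m["event"], "fields": {},
            })
            payload = m["rest"]
        elif records and line.strip():
            payload = line  # continuation of the previous event
        else:
            continue
        for key, value in FIELD.findall(payload):
            records[-1]["fields"][key] = value
    return records


if __name__ == "__main__":
    sample = (
        "          memtoy-1830  [001]   602.427184: put_swap_token:\n"
        "token_mm=ffff880037a45880\n"
        "             zsh-1789  [000]   602.427281: replace_swap_token:\n"
        "old_token_mm=          (null) old_prio=0 new_token_mm=ffff88015eaf7018\n"
        "new_prio=2\n"
    )
    for rec in parse_trace(sample):
        print(rec["ts"], rec["comm"], rec["event"], rec["fields"])
```

Field values are kept as strings, so callers can decide whether to treat them as hex addresses or integers.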
  13. 15 Jun, 2011 1 commit
    • rcu: Use softirq to address performance regression · 09223371
      Shaohua Li committed
      Commit a26ac245 (rcu: move TREE_RCU from softirq to kthread)
      introduced a performance regression. In an AIM7 test, this commit
      degraded performance by about 40%.
      
      The commit runs RCU callbacks in a kthread instead of softirq. We observed
      a high rate of context switches caused by this. Our test system has
      64 CPUs and HZ is 1000, so we saw more than 64k context switches per
      second caused by RCU's per-CPU kthreads.  A trace showed that most of
      the time the RCU per-CPU kthread doesn't actually handle any callbacks,
      but instead just does a very small amount of work handling grace periods.
      This means that RCU's per-CPU kthreads are making the scheduler do quite
      a bit of work in order to allow a very small amount of RCU-related
      processing to be done.
      
      Alex Shi's analysis determined that this slowdown is due to lock
      contention within the scheduler.  Unfortunately, as Peter Zijlstra points
      out, the scheduler's real-time semantics require global action, which
      means that this contention is inherent in real-time scheduling.  (Yes,
      perhaps someone will come up with a workaround -- otherwise, -rt is not
      going to do well on large SMP systems -- but this patch will work around
      this issue in the meantime.  And "the meantime" might well be forever.)
      
      This patch therefore re-introduces softirq processing to RCU, but only
      for core RCU work.  RCU callbacks are still executed in kthread context,
      so that only a small amount of RCU work runs in softirq context in the
      common case.  This should minimize ksoftirqd execution, allowing us to
      skip boosting of ksoftirqd for CONFIG_RCU_BOOST=y kernels.
      Signed-off-by: Shaohua Li <shaohua.li@intel.com>
      Tested-by: "Alex,Shi" <alex.shi@intel.com>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
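The 64k figure in the RCU commit message is consistent with a simple back-of-the-envelope estimate. The sketch below assumes that each per-CPU kthread is woken about once per scheduler tick. That is an assumption made for illustration, since the commit message only reports the CPU count, HZ and the observed rate.

```python
def tick_driven_wakeups_per_second(num_cpus, hz):
    """Wakeups per second if every CPU's kthread runs once per tick."""
    return num_cpus * hz


if __name__ == "__main__":
    # 64 CPUs at HZ=1000, the test system described in the commit message.
    print(tick_driven_wakeups_per_second(64, 1000))
```

Each such wakeup costs at least one context switch, so under this assumption the estimate acts as a floor for the context switch rate.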
  14. 08 Jun, 2011 4 commits
  15. 06 Jun, 2011 1 commit
  16. 03 Jun, 2011 1 commit
    • net: tracepoint of net_dev_xmit sees freed skb and causes panic · ec764bf0
      Koki Sanagi committed
      Because the skb may be freed with kfree_skb() and zeroed after
      ndo_start_xmit, we must not access its contents, such as skb->len and
      skb->dev->name, after ndo_start_xmit. But trace_net_dev_xmit does exactly
      that and causes a panic through a NULL pointer dereference.
      This patch fixes trace_net_dev_xmit so that it does not read the contents
      of the skb directly.
      
      To reproduce this panic:
      
      1. Enable the net_dev_xmit tracepoint
      2. Create 2 guests on KVM
      3. Make both guests use virtio_net
      4. Run netperf from one guest to the other for a long time as network load
      5. The host will panic (it takes about 30 minutes)
      Signed-off-by: Koki Sanagi <sanagi.koki@jp.fujitsu.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
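The general pattern behind the net_dev_xmit fix is to snapshot whatever the tracepoint needs before handing the buffer to code that may free it. The sketch below is a Python illustration of that pattern only, not the kernel change: `Skb`, `start_xmit` and `xmit_and_trace` are hypothetical stand-ins.

```python
class Skb:
    """Stand-in for a socket buffer with the two fields named in the commit."""

    def __init__(self, length, dev_name):
        self.len = length
        self.dev_name = dev_name


def start_xmit(skb):
    """Stand-in for ndo_start_xmit: the driver may free and zero the buffer."""
    skb.len = 0
    skb.dev_name = None
    return 0  # hypothetical "OK" return code


def xmit_and_trace(skb, trace):
    # Snapshot the values while the buffer is still known to be valid.
    skb_len = skb.len
    dev_name = skb.dev_name
    rc = start_xmit(skb)
    # From here on only the saved copies are used, never skb itself.
    trace.append({"len": skb_len, "dev": dev_name, "rc": rc})
    return rc


if __name__ == "__main__":
    events = []
    xmit_and_trace(Skb(1500, "eth0"), events)
    print(events)
```

Reading `skb.len` after `start_xmit` in this model would return the zeroed value. In the kernel the equivalent access is what leads to the NULL pointer dereference described in the commit message.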
  17. 26 May, 2011 2 commits
  18. 20 May, 2011 1 commit
  19. 12 May, 2011 1 commit
  20. 06 May, 2011 1 commit